Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

Abstract

Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision-making policy without sharing raw trajectories. However, if even a small fraction of these agents are adversarial, this can lead to catastrophic results. We propose a policy-gradient-based approach that is robust to adversarial agents, which can send arbitrary values to the server. Under this setting, our results form the first global convergence guarantees with general parametrization. These results demonstrate resilience to adversaries while achieving optimal sample complexity of order $\tilde{\mathcal{O}}\left( \frac{1}{N\epsilon^2} \left( 1+\frac{f^2}{N}\right)\right)$, where $N$ is the total number of agents and $f<N/2$ is the number of adversarial agents.
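
To give a feel for how the stated rate scales, the following minimal sketch evaluates only the expression inside the $\tilde{\mathcal{O}}(\cdot)$, that is $\frac{1}{N\epsilon^2}\left(1+\frac{f^2}{N}\right)$. It ignores the constants and logarithmic factors that the notation hides, so the numbers are relative scalings rather than actual sample counts. The function name `sample_complexity_scale` and the example values of $N$, $f$, and $\epsilon$ are illustrative assumptions, not taken from the paper.

```python
def sample_complexity_scale(num_agents: int, num_adversaries: int, epsilon: float) -> float:
    """Evaluate (1 / (N * eps^2)) * (1 + f^2 / N).

    Hypothetical helper: constants and log factors hidden by the
    O-tilde notation are ignored, so only ratios are meaningful.
    """
    if num_agents <= 0:
        raise ValueError("num_agents (N) must be positive")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The abstract requires fewer than half of the agents to be adversarial.
    if not 0 <= num_adversaries < num_agents / 2:
        raise ValueError("num_adversaries (f) must satisfy 0 <= f < N/2")

    honest_term = 1.0 / (num_agents * epsilon ** 2)          # 1 / (N * eps^2)
    adversary_factor = 1.0 + num_adversaries ** 2 / num_agents  # 1 + f^2 / N
    return honest_term * adversary_factor


if __name__ == "__main__":
    # Assumed example values: N = 100 agents, target accuracy eps = 0.1.
    baseline = sample_complexity_scale(100, 0, 0.1)
    for f in (0, 10, 40):
        ratio = sample_complexity_scale(100, f, 0.1) / baseline
        print(f"f={f:2d}: {ratio:.1f}x the adversary-free scale")
```

With no adversaries ($f=0$) the expression reduces to $\frac{1}{N\epsilon^2}$, which shrinks linearly in the number of agents. As long as $f^2$ is at most on the order of $N$, the factor $1+\frac{f^2}{N}$ stays bounded by a constant, so the order of the rate is unchanged; in the example, $f=10$ with $N=100$ exactly doubles the expression.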
