Locally Private Distributed Reinforcement Learning

  • 2020-01-31 09:03:23
  • Hajime Ono, Tsubasa Takahashi
  • 3


We study locally differentially private algorithms for reinforcement learning to obtain a robust policy that performs well across distributed private environments. Our algorithm protects the information in local agents' models from being exploited through adversarial reverse engineering. Because a local policy is strongly affected by its individual environment, an agent's output may unintentionally leak private information. In our proposed algorithm, local agents update the model in their own environments and report noisy gradients designed to satisfy local differential privacy (LDP), which gives a rigorous local privacy guarantee. Using the set of reported noisy gradients, a central aggregator updates its model and delivers it to different local agents. In our empirical evaluation, we demonstrate that our method performs well under LDP. To the best of our knowledge, this is the first work that realizes distributed reinforcement learning under LDP. This work enables us to obtain a robust agent that performs well across distributed private environments.
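The report-and-aggregate loop described above can be illustrated with a minimal sketch. The abstract does not state which noise mechanism is used, so everything here is an assumption made for illustration: the gradient is clipped to an L1 ball of radius `clip_norm`, Laplace noise of scale `2 * clip_norm / epsilon` is added per coordinate (the standard Laplace mechanism for L1 sensitivity `2 * clip_norm`), and the aggregator takes a plain gradient step on the mean of the reports. All function and parameter names (`clip_l1`, `privatize`, `aggregate`, `clip_norm`, `epsilon`, `lr`) are hypothetical.

```python
import random


def clip_l1(grad, clip_norm):
    """Rescale grad so that its L1 norm is at most clip_norm."""
    norm = sum(abs(g) for g in grad)
    if norm <= clip_norm:
        return list(grad)
    factor = clip_norm / norm
    return [g * factor for g in grad]


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(grad, clip_norm, epsilon, rng):
    """Local agent side: clip the gradient, then add Laplace noise.

    Assumption: any two clipped gradients differ by at most
    2 * clip_norm in L1 norm, so per-coordinate Laplace noise with
    scale 2 * clip_norm / epsilon gives epsilon-LDP for one report.
    """
    clipped = clip_l1(grad, clip_norm)
    scale = 2.0 * clip_norm / epsilon
    return [g + laplace(scale, rng) for g in clipped]


def aggregate(params, noisy_grads, lr):
    """Aggregator side: average the noisy reports and take one step."""
    n = len(noisy_grads)
    mean = [sum(column) / n for column in zip(*noisy_grads)]
    return [p - lr * m for p, m in zip(params, mean)]
```

In this sketch each agent would call `privatize` on its locally computed gradient and send only the result, so the aggregator never sees a raw gradient. The privacy budget stated in the comment covers a single report; repeated rounds would consume additional budget under composition.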


