Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

  • 2021-10-19 17:44:09
  • Chonghua Liao, Jiafan He, Quanquan Gu
  • 33

Abstract

Reinforcement learning (RL) algorithms can be used to provide personalizedservices, which rely on users' private and sensitive data. To protect theusers' privacy, privacy-preserving RL algorithms are in demand. In this paper,we study RL with linear function approximation and local differential privacy(LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm forlearning a class of Markov decision processes (MDPs) dubbed linear mixtureMDPs, and obtains an $\tilde{\mathcal{O}}(d^{5/4}H^{7/4}T^{3/4}\left(\log(1/\delta)\right)^{1/4}\sqrt{1/\varepsilon})$regret, where $d$ is the dimension of feature mapping, $H$ is the length of theplanning horizon, and $T$ is the number of interactions with the environment.We also prove a lower bound$\Omega(dH\sqrt{T}/\left(e^{\varepsilon}(e^{\varepsilon}-1)\right))$ forlearning linear mixture MDPs under $\varepsilon$-LDP constraint. Experiments onsynthetic datasets verify the effectiveness of our algorithm. To the best ofour knowledge, this is the first provable privacy-preserving RL algorithm withlinear function approximation.

 

Quick Read (beta)

loading the full paper ...