Implicit Policy for Reinforcement Learning

  • 2019-02-03 16:26:40
  • Yunhao Tang, Shipra Agrawal
We introduce Implicit Policy, a general class of expressive policies that can flexibly represent complex action distributions in reinforcement learning, with efficient algorithms to compute entropy-regularized policy gradients. We empirically show that, despite its simplicity in implementation, entropy regularization combined with a rich policy class can attain desirable properties displayed under the maximum entropy reinforcement learning framework, such as robustness and multi-modality.
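
To make the idea of an expressive, sample-based policy concrete, here is a minimal sketch. It assumes that an implicit policy is one defined by its sampling procedure: random noise is fed, together with the state, through a nonlinear function, and the output is the action. The abstract does not specify an architecture, so the two-layer network, the layer sizes, and the names `make_policy` and `sample_action` are hypothetical illustrations, not the authors' implementation. No training or entropy-regularized gradient is shown.

```python
import math
import random


def make_policy(state_dim, noise_dim, hidden_dim, seed=0):
    """Build a toy implicit policy: action = f(state, noise).

    Assumption: the weights are random and untrained; a real policy
    would learn them with a policy gradient method.
    """
    rng = random.Random(seed)
    in_dim = state_dim + noise_dim
    # Scale weights so the tanh output does not saturate.
    s1 = 1.0 / math.sqrt(in_dim)
    s2 = 1.0 / math.sqrt(hidden_dim)
    w1 = [[rng.gauss(0.0, s1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [rng.gauss(0.0, s2) for _ in range(hidden_dim)]

    def sample_action(state):
        # Fresh noise on every call makes the policy stochastic.
        eps = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        x = list(state) + eps
        # ReLU hidden layer followed by a tanh-squashed scalar action.
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return math.tanh(sum(w * h for w, h in zip(w2, hidden)))

    return sample_action


policy = make_policy(state_dim=3, noise_dim=2, hidden_dim=16)
actions = [policy([0.1, -0.2, 0.3]) for _ in range(5)]
print(actions)
```

Because the action distribution is defined only through sampling, its density is not available in closed form in this sketch, which is why computing an entropy term for such policies requires dedicated algorithms like those the abstract refers to.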


