Revisiting Discrete Soft Actor-Critic

  • 2022-09-22 04:15:15
  • Haibin Zhou, Zichuan Lin, Junyou Li, Deheng Ye, Qiang Fu, Wei Yang


We study the adaption of soft actor-critic (SAC) from continuous action space to discrete action space. We revisit vanilla SAC and provide an in-depth understanding of its Q value underestimation and performance instability issues when applied to discrete settings. We thereby propose entropy-penalty and double average Q-learning with Q-clip to address these issues. Extensive experiments on typical benchmarks with discrete action space, including Atari games and a large-scale MOBA game, show the efficacy of our proposed method. Our code is at:
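
For orientation, the soft state value that a discrete-action SAC critic bootstraps from can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names `soft_state_value` and `td_target`, the `combine` switch, and all default values are hypothetical. The `"min"` branch is the clipped double-Q estimate used by vanilla SAC; the `"mean"` branch is only one plausible reading of "double average" and is an assumption. The entropy-penalty and Q-clip components are not shown, since the abstract does not specify them.

```python
import numpy as np


def soft_state_value(q1, q2, probs, alpha, combine="min"):
    """Soft state value V(s) for a discrete policy.

    q1, q2 : arrays of shape (batch, n_actions) from two target critics.
    probs  : array of shape (batch, n_actions), policy probabilities pi(a|s).
    alpha  : entropy temperature (scalar).
    combine: "min" for the clipped double-Q estimate of vanilla SAC,
             "mean" for an averaged estimate (an assumption here, used only
             to illustrate the idea of averaging the two critics).
    """
    if combine == "min":
        q = np.minimum(q1, q2)
    elif combine == "mean":
        q = 0.5 * (q1 + q2)
    else:
        raise ValueError(f"unknown combine mode: {combine}")

    # Clip probabilities before the log so zero-probability actions
    # do not produce -inf; their contribution is still weighted by ~0.
    log_probs = np.log(np.clip(probs, 1e-8, 1.0))

    # With discrete actions the expectation over the policy is an exact sum.
    return np.sum(probs * (q - alpha * log_probs), axis=1)


def td_target(reward, done, next_value, gamma=0.99):
    """One-step bootstrapped target r + gamma * (1 - done) * V(s')."""
    return reward + gamma * (1.0 - done) * next_value


# Tiny example: two actions, uniform policy, critics that disagree.
q1 = np.array([[1.0, 2.0]])
q2 = np.array([[2.0, 1.0]])
probs = np.array([[0.5, 0.5]])

v_min = soft_state_value(q1, q2, probs, alpha=0.0, combine="min")
v_mean = soft_state_value(q1, q2, probs, alpha=0.0, combine="mean")
target = td_target(np.array([1.0]), np.array([0.0]), v_min, gamma=0.5)
```

In this toy case the elementwise minimum yields a lower value than the average whenever the two critics disagree, which gives a feel for why a min-based target can bias Q values downward.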

