Split Q Learning: Reinforcement Learning with Two-Stream Rewards

  • 2019-11-12 19:10:05
  • Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi

Abstract

Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for reinforcement learning that extends the standard Q-learning approach to incorporate a two-stream framework of reward processing, with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems.
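
To make the two-stream idea concrete, the following is a minimal tabular sketch of one plausible reading of the abstract, not the authors' algorithm. It assumes that the scalar reward is split by sign into a positive and a negative stream, that each stream keeps its own Q-table, and that actions are chosen greedily on the sum of the two tables. The class name `SplitQAgent` and the bias weights `w_pos` and `w_neg` are hypothetical names introduced here for illustration; the abstract does not specify the parameterization.

```python
import numpy as np


class SplitQAgent:
    """Tabular agent with separate value tables for positive and negative rewards.

    Illustrative sketch only: the sign-based split and the weights
    w_pos / w_neg are assumptions, not taken from the paper.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 w_pos=1.0, w_neg=1.0):
        self.q_pos = np.zeros((n_states, n_actions))  # positive-reward stream
        self.q_neg = np.zeros((n_states, n_actions))  # negative-reward stream
        self.alpha = alpha    # learning rate
        self.gamma = gamma    # discount factor
        self.w_pos = w_pos    # assumed sensitivity to positive rewards
        self.w_neg = w_neg    # assumed sensitivity to negative rewards

    def combined(self, state):
        """Action values used for decisions: the sum of both streams."""
        return self.q_pos[state] + self.q_neg[state]

    def act(self, state, epsilon, rng):
        """Epsilon-greedy action selection on the combined values."""
        if rng.random() < epsilon:
            return int(rng.integers(self.q_pos.shape[1]))
        return int(np.argmax(self.combined(state)))

    def update(self, s, a, reward, s_next):
        """Split the reward by sign and update each stream separately."""
        r_pos = max(reward, 0.0)
        r_neg = min(reward, 0.0)
        # Both streams bootstrap from the action that is greedy
        # with respect to the combined values in the next state.
        a_next = int(np.argmax(self.combined(s_next)))
        streams = ((self.q_pos, self.w_pos, r_pos),
                   (self.q_neg, self.w_neg, r_neg))
        for q, w, r in streams:
            target = w * r + self.gamma * q[s_next, a_next]
            q[s, a] += self.alpha * (target - q[s, a])
```

Under these assumptions, setting `w_pos = w_neg = 1` makes the sum of the two tables follow the standard Q-learning update, so unequal weights are what introduce a reward-processing bias. For example, `w_neg < 1` yields an agent that discounts punishments relative to gains.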

 
