Abstract
Reinforcement learning (RL) has a rich history in neuroscience, from early work on dopamine as a reward prediction error signal (Schultz et al., 1997) to recent work proposing that the brain could implement a form of 'distributional reinforcement learning' popularized in machine learning (Dabney et al., 2020). There has been a close link between theoretical advances in reinforcement learning and neuroscience experiments throughout this literature, and the theories describing the experimental data have therefore become increasingly complex. Here, we provide an introduction and mathematical background to many of the methods that have been used in systems neuroscience. We start with an overview of the RL problem and classical temporal difference algorithms, followed by a discussion of 'model-free', 'model-based', and intermediate RL algorithms. We then introduce deep reinforcement learning and discuss how this framework has led to new insights in neuroscience. This includes a particular focus on meta-reinforcement learning (Wang et al., 2018) and distributional RL (Dabney et al., 2020). Finally, we discuss potential shortcomings of the RL formalism for neuroscience and highlight open questions in the field. Code that implements the methods discussed and generates the figures is also provided.