Prioritized Sweeping Neural DynaQ with Multiple Predecessors, and Hippocampal Replays

  • 2018-08-13 12:27:55
  • Lise Aubin, Mehdi Khamassi, Benoît Girard

Abstract

During sleep and awake rest, the hippocampus replays sequences of place cells that have been activated during prior experiences. These replays have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna reinforcement learning algorithms use off-line replays to improve learning. Under a limited replay budget, a prioritized sweeping approach, which requires a model of the transitions to the predecessors, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized sweeping Q-learning, for which we developed a growing multiple expert algorithm, able to cope with multiple predecessors. The resulting architecture is able to improve the learning of simulated agents confronted with a navigation task. We predict that, in animals, learning the world model should occur during rest periods, and that the corresponding replays should be shuffled.
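
To make the replay mechanism concrete, here is a minimal tabular sketch of prioritized sweeping on a toy corridor. It is not the neural network architecture proposed in the paper: the lookup-table world model, the corridor task, the function names (`td_error`, `replay`), the learning rate of 1, and the values of `GAMMA`, `THETA` and the replay budget are all assumptions chosen for illustration. It only shows why a model of the predecessors is needed: after each off-line update of a state, every state-action pair known to lead to it is re-evaluated and queued by the size of its prediction error.

```python
import heapq
from collections import defaultdict

GAMMA = 0.9    # discount factor (assumed value)
THETA = 1e-4   # minimum priority for queuing a predecessor (assumed value)
N_STATES, N_ACTIONS = 5, 2   # toy corridor; action 0 = left, 1 = right
TERMINAL = {N_STATES - 1}    # reaching the last state ends the episode

def td_error(Q, s, a, r, s_next):
    """Q-learning prediction error for one modelled transition."""
    if s_next in TERMINAL:
        target = r
    else:
        target = r + GAMMA * max(Q[(s_next, b)] for b in range(N_ACTIONS))
    return target - Q[(s, a)]

def replay(Q, model, predecessors, queue, budget):
    """Run at most `budget` off-line updates, largest priority first."""
    done = 0
    while queue and done < budget:
        _, (s, a) = heapq.heappop(queue)
        r, s_next = model[(s, a)]
        # Learning rate of 1: adequate only because this toy world is deterministic.
        Q[(s, a)] += td_error(Q, s, a, r, s_next)
        done += 1
        # Sweep backwards: a state may have several predecessors.
        for ps, pa in predecessors[s]:
            pr, _ = model[(ps, pa)]
            priority = abs(td_error(Q, ps, pa, pr, s))
            if priority > THETA:
                heapq.heappush(queue, (-priority, (ps, pa)))  # negate: heapq is a min-heap
    return done

# Hand-built deterministic world model (in the paper the model is learned).
model, predecessors = {}, defaultdict(set)
for s in range(N_STATES - 1):
    for a in range(N_ACTIONS):
        s_next = max(0, s - 1) if a == 0 else s + 1
        reward = 1.0 if s_next in TERMINAL else 0.0
        model[(s, a)] = (reward, s_next)
        predecessors[s_next].add((s, a))

Q = defaultdict(float)
# Seed the queue with the single rewarded transition, as if just experienced.
queue = [(-1.0, (N_STATES - 2, 1))]
n_updates = replay(Q, model, predecessors, queue, budget=50)
greedy = [max(range(N_ACTIONS), key=lambda b: Q[(s, b)]) for s in range(N_STATES - 1)]
```

In this sketch, state 2 has two predecessors, (1, right) and (3, left), so a single update of state 2 queues both. With a smaller `budget`, only the pairs closest to the reward would be updated, which is the situation where ordering replays by priority matters.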

 
