Monte Carlo Q-learning for General Game Playing

  • 2018-05-21 16:16:27
  • Hui Wang, Michael Emmerich, Aske Plaat

Abstract

After the recent groundbreaking results of AlphaGo, we have seen a strong interest in reinforcement learning in game playing. General Game Playing (GGP) provides a good testbed for reinforcement learning. In GGP, a specification of the game rules is given. GGP problems can be solved by reinforcement learning. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex), to allow comparison to Banerjee et al. As expected, Q-learning converges, although much more slowly than MCTS. Borrowing an idea from MCTS, we enhance Q-learning with Monte Carlo Search, to give QM-learning. This enhancement improves the performance of pure Q-learning. We believe that QM-learning can also be used to improve the performance of reinforcement learning further for larger games, something which we will test in future work.
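To make the idea of combining Q-learning with Monte Carlo search concrete, the following is a minimal Python sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: Monte Carlo playouts are used to pick a move only in states for which no Q-values have been recorded yet, the opponent plays uniformly at random, and the game is a toy pile game (remove 1 or 2 stones, taking the last stone wins) standing in for the paper's board games. All function names and parameter values are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (1, 2)  # toy game (assumption): remove 1 or 2 stones; taking the last stone wins


def legal(pile):
    """Moves available with `pile` stones left."""
    return [a for a in ACTIONS if a <= pile]


def step(pile, action, rng):
    """Apply the agent's move, then a random opponent reply (assumed opponent).
    Returns (next_pile, reward, done) from the agent's point of view."""
    pile -= action
    if pile == 0:
        return 0, 1.0, True        # agent took the last stone
    pile -= rng.choice(legal(pile))
    if pile == 0:
        return 0, -1.0, True       # opponent took the last stone
    return pile, 0.0, False


def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in legal(s_next))
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def monte_carlo_move(pile, rng, playouts=20):
    """Pick the move with the best total result over random playouts."""
    def playout(p, a):
        p, r, done = step(p, a, rng)
        while not done:
            p, r, done = step(p, rng.choice(legal(p)), rng)
        return r
    return max(legal(pile),
               key=lambda a: sum(playout(pile, a) for _ in range(playouts)))


def choose(Q, pile, rng, epsilon, seen):
    """Epsilon-greedy, except that unseen states use Monte Carlo search (assumption)."""
    if pile not in seen:
        return monte_carlo_move(pile, rng)
    if rng.random() < epsilon:
        return rng.choice(legal(pile))
    return max(legal(pile), key=lambda a: Q[(pile, a)])


def train(episodes=2000, start=7, seed=0):
    rng = random.Random(seed)
    Q, seen = defaultdict(float), set()
    for _ in range(episodes):
        pile, done = start, False
        while not done:
            a = choose(Q, pile, rng, epsilon=0.1, seen=seen)
            nxt, r, done = step(pile, a, rng)
            q_update(Q, pile, a, r, nxt, done)
            seen.add(pile)
            pile = nxt
    return Q
```

Calling `train()` returns the learned Q-table keyed by `(pile, action)`. In this sketch the playouts only replace the agent's very first move in each state, so their effect is limited to steering early experience toward promising moves; the Q-learning update itself is unchanged.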

 
