Monte Carlo Q-learning for General Game Playing

  • 2018-02-16 14:18:46
  • Hui Wang, Michael Emmerich, Aske Plaat

Abstract

Recently, interest in reinforcement learning for game playing has been renewed, as evidenced by the groundbreaking results achieved by AlphaGo. General Game Playing (GGP) provides a good testbed for reinforcement learning, currently one of the hottest fields of AI. In GGP, a specification of a game's rules is given. The description specifies a reinforcement learning problem, leaving programs to find strategies for playing well. Q-learning is one of the canonical reinforcement learning methods and has been used as a baseline in previous work (Banerjee & Stone, IJCAI 2007). We implement Q-learning in GGP for three small board games (Tic-Tac-Toe, Connect-Four, Hex). We find that Q-learning converges, and thus that this general reinforcement learning method is indeed applicable to General Game Playing. However, convergence is slow in comparison to MCTS (a reinforcement learning method reported to achieve good results). We enhance Q-learning with Monte Carlo Search. This enhancement improves the performance of pure Q-learning, although it does not yet outperform MCTS. Future work is needed on the relation between MCTS and Q-learning, and on larger problem instances.
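
For readers unfamiliar with the method, the standard tabular Q-learning rule can be sketched in Python as below. This is only an illustration of the textbook update, not the authors' GGP implementation: the function names, the string encoding of states and actions, and the values of alpha, gamma and epsilon are all assumptions made for the example. It also shows only plain epsilon-greedy exploration, not the Monte Carlo Search enhancement.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (hypothetical helper).

    Moves Q(s, a) toward reward + gamma * max over a' of Q(s', a').
    A terminal next state is represented by an empty next_actions list,
    in which case the bootstrap term is 0.
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
rng = random.Random(0)

# A single winning move from state "s" into a terminal state "t".
q_update(q_table, "s", "a", reward=1.0, next_state="t", next_actions=[])

# With epsilon = 0 the policy is purely greedy with respect to q_table.
chosen = epsilon_greedy(q_table, "s", ["a", "b"], epsilon=0.0, rng=rng)
```

Because every (state, action) pair is stored explicitly, the table grows with the number of distinct positions visited, which is consistent with the abstract's restriction to small board games.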

 
