Efficient Learning in Chinese Checkers: Comparing Parameter Sharing in Multi-Agent Reinforcement Learning

Abstract

We show that multi-agent reinforcement learning (MARL) with full parameter sharing outperforms independent and partially shared architectures in the competitive, perfect-information, homogeneous game of Chinese Checkers. To run our experiments, we develop a new MARL environment: variable-size, six-player Chinese Checkers. This custom environment was developed in PettingZoo and supports all traditional rules of the game, including chaining jumps. This is, to the best of our knowledge, the first implementation of Chinese Checkers that remains faithful to the true game. Chinese Checkers is difficult to learn due to its large branching factor and potentially infinite horizons. We borrow the concept of branching actions (submoves) from complex action spaces in other RL domains, where a submove may not end a player's turn immediately. This drastically reduces the dimensionality of the action space. Our observation space is inspired by AlphaGo, with many binary game boards stacked in a 3D array to encode information. The PettingZoo environment, training and evaluation logic, and analysis scripts can be found on GitHub (https://github.com/noahadhikari/pettingzoo-chinese-checkers).
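To make the submove idea and the stacked binary observation concrete, here is a minimal Python sketch. It is not the encoding used in the linked repository: the flat cell indexing, the (cell, direction, step-or-jump) action layout, the single "end turn" action, and every function name below are assumptions chosen for illustration only.

```python
from typing import List, Optional, Tuple

NUM_DIRECTIONS = 6  # each cell on a hexagonal board has up to six neighbours


def num_submove_actions(num_cells: int) -> int:
    """Size of a hypothetical submove action space.

    One action per (cell, direction, step-or-jump) triple, plus one
    explicit "end turn" action. Chained jumps are expressed as several
    consecutive submoves rather than as one enumerated full move.
    """
    return num_cells * NUM_DIRECTIONS * 2 + 1


def encode_submove(cell: int, direction: int, is_jump: bool, num_cells: int) -> int:
    """Map a (cell, direction, is_jump) triple to a flat action index."""
    if not (0 <= cell < num_cells and 0 <= direction < NUM_DIRECTIONS):
        raise ValueError("cell or direction out of range")
    return (cell * NUM_DIRECTIONS + direction) * 2 + int(is_jump)


def decode_submove(action: int, num_cells: int) -> Optional[Tuple[int, int, bool]]:
    """Inverse of encode_submove; returns None for the "end turn" action."""
    end_turn = num_cells * NUM_DIRECTIONS * 2
    if action == end_turn:
        return None
    if not (0 <= action < end_turn):
        raise ValueError("action out of range")
    pair, is_jump = divmod(action, 2)
    cell, direction = divmod(pair, NUM_DIRECTIONS)
    return cell, direction, bool(is_jump)


def binary_planes(owner_by_cell: List[int], num_players: int) -> List[List[int]]:
    """Build one binary plane per player from a flat ownership list.

    owner_by_cell[i] is the index of the player occupying cell i, or -1
    if the cell is empty. A real observation would stack 2D boards into
    a 3D array; flat lists keep this sketch dependency-free.
    """
    return [
        [1 if owner == player else 0 for owner in owner_by_cell]
        for player in range(num_players)
    ]
```

Under this assumed layout, the action count grows linearly with the number of cells (for example, `num_submove_actions(121)` for the 121-cell standard board), whereas enumerating every complete chain of jumps as a separate action would not.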
