Abstract
Closed drafting, or "pick and pass", is a popular game mechanic in which, each round, players select a card or other playable element from their hand and pass the rest to the next player. In this paper, we establish first-principles methods for studying the interpretability, generalizability, and memory of Deep Q-Network (DQN) models playing closed drafting games. In particular, we use a popular family of closed drafting games called "Sushi Go Party", in which we achieve state-of-the-art performance. We fit decision rules to interpret the decision-making strategy of trained deep reinforcement learning (DRL) agents by comparing them to the ranking preferences of different types of human players. As Sushi Go Party can be expressed as a set of closely related games based on the set of cards in play, we quantify the generalizability of DRL models trained on various sets of cards, establishing a method to benchmark agent performance as a function of environment unfamiliarity. Using the explicitly calculable memory of other players' hands in closed drafting games, we create measures of the ability of DRL models to learn memory.