Abstract
Human decision-making often involves combining similar states into categories and reasoning at the level of the categories rather than the actual states. Guided by this intuition, we propose a novel method for clustering state features in deep reinforcement learning (RL) methods to improve their interpretability. Specifically, we propose a plug-and-play framework termed \emph{vector quantized reinforcement learning} (VQ-RL) that extends classic RL pipelines with an auxiliary classification task based on vector quantized (VQ) encoding and aligns with policy training. The VQ encoding method categorizes features with similar semantics into clusters and results in tighter clusters with better separation compared to classic deep RL methods, thus enabling neural models to learn similarities and differences between states better. Furthermore, we introduce two regularization methods to help increase the separation between clusters and avoid the risks associated with VQ training. In simulations, we demonstrate that VQ-RL improves interpretability and investigate its impact on robustness and generalization of deep RL.
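
To give a rough sense of the categorization step, the sketch below shows generic vector quantization: each state feature is assigned to its nearest entry in a codebook, so features that lie close together share a cluster index. This is not the VQ-RL implementation. The function names, the fixed codebook, and the toy two-dimensional features are all hypothetical, and a real system would learn the codebook jointly with the encoder rather than hard-code it.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.

    Returns the cluster index of every feature and the corresponding
    codebook vector (the quantized feature).
    """
    indices = []
    for feature in features:
        dists = [squared_distance(feature, code) for code in codebook]
        indices.append(dists.index(min(dists)))  # nearest code wins
    quantized = [codebook[i] for i in indices]
    return indices, quantized


# Hypothetical codebook with three clusters in a 2-D feature space.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Hypothetical state features, e.g. outputs of a policy encoder.
features = [[0.1, -0.1], [0.9, 1.2], [-0.8, 1.9], [0.2, 0.1]]

indices, quantized = quantize(features, codebook)
print(indices)  # the first and last features fall into the same cluster
```

Reasoning over `indices` rather than the raw features mirrors the idea of treating similar states as one category.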