Abstract
The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short because they are too simple and lack realism. To address this gap, we propose Hokoff, a comprehensive set of pre-collected datasets that covers both offline RL and offline MARL, accompanied by a robust framework, to facilitate further research. This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game known for its intricate nature, closely resembling real-life situations. Utilizing this framework, we benchmark a variety of offline RL and offline MARL algorithms. We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game. Our results reveal the shortcomings of current offline RL approaches in handling task complexity, generalization, and multi-task learning.