POPGym Arcade: Parallel Pixelated POMDPs

  • 2025-03-04 05:23:52
  • Zekang Wang, Zhe He, Edan Toledo, Steven Morad
  • 0

Abstract

We introduce POPGym Arcade, a benchmark consisting of 7 pixel-based environments each with three difficulties, utilizing a single observation and action space. Each environment offers both fully observable and partially observable variants, enabling counterfactual studies on partial observability. POPGym Arcade utilizes JIT compilation on hardware accelerators to achieve substantial speedups over CPU-bound environments. Moreover, this enables Podracer-style architectures to further increase hardware utilization and training speed. We evaluate memory models on our environments using a Podracer variant of Q learning, and examine the results. Finally, we generate memory saliency maps, uncovering how memories propagate through policies. Our library is available at https://github.com/bolt-research/popgym_arcade.
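
As a rough illustration of the design the abstract describes, the sketch below compiles and vectorizes a toy pixel environment in JAX, with a flag that switches between a fully observable and a partially observable view of the same observation shape. Everything here is an assumption made for illustration: the grid size `GRID`, the `step` function, the action encoding, and the masking rule are hypothetical and are not taken from the POPGym Arcade library.

```python
import jax
import jax.numpy as jnp

GRID = 8  # hypothetical frame size, not a POPGym Arcade constant

def step(pos, action, partial):
    """Toy environment step (illustrative only).

    pos:     int array of shape (2,), the agent's (row, col)
    action:  int in {0: up, 1: down, 2: left, 3: right}
    partial: bool, True selects the partially observable variant
    """
    moves = jnp.array([[-1, 0], [1, 0], [0, -1], [0, 1]])
    new_pos = jnp.clip(pos + moves[action], 0, GRID - 1)

    # Render a one-pixel "frame" marking the agent's position.
    obs = jnp.zeros((GRID, GRID)).at[new_pos[0], new_pos[1]].set(1.0)

    # Partially observable variant: only the top half of the frame is visible.
    # Both variants return an observation of the same shape.
    top_half = (jnp.arange(GRID) < GRID // 2)[:, None]
    mask = jnp.where(partial, top_half, True)
    return new_pos, obs * mask

# vmap runs many environments in parallel; jit compiles the batched step
# for whatever accelerator JAX is using.
batched_step = jax.jit(jax.vmap(step, in_axes=(0, 0, None)))

positions = jnp.full((4, 2), 6, dtype=jnp.int32)
actions = jnp.array([0, 1, 2, 3])
_, full_frames = batched_step(positions, actions, False)
_, partial_frames = batched_step(positions, actions, True)
```

Because both variants share one observation shape, the same policy network can be trained on either, which is what makes a like-for-like comparison of full and partial observability possible in a setup of this kind.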

 
