Abstracting Geo-specific Terrains to Scale Up Reinforcement Learning

Abstract

Multi-agent reinforcement learning (MARL) is increasingly used to train dynamic and adaptive synthetic characters for interactive simulations on geo-specific terrains. Frameworks such as Unity's ML-Agents help make such reinforcement learning experiments more accessible to the simulation community. Military training simulations also benefit from advances in MARL, but they have immense computational requirements due to their complex, continuous, stochastic, partially observable, non-stationary, and doctrine-based nature. Furthermore, these simulations require geo-specific terrains, which further increases the computational resources needed. In our research, we leverage Unity's waypoints to automatically generate multi-layered representation abstractions of the geo-specific terrains to scale up reinforcement learning while still allowing the transfer of learned policies between different representations. Our early exploratory results on a novel MARL scenario, where each side has differing objectives, indicate that waypoint-based navigation enables faster and more efficient learning while producing trajectories similar to those taken by expert human players in CSGO gaming environments. This research highlights the potential of waypoint-based navigation for reducing the computational costs of developing and training MARL models for military training simulations, where geo-specific terrains and differing objectives are crucial.
