Abstract
Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires long training times due to the inherent complexity of these systems. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physical embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality (MR) digital twin (DT) framework capable of: (i) boosting training speeds by selectively scaling parallelized simulation workloads on demand, and (ii) immersing the MARL policies across hybrid simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer, across both case studies. Results indicate up to a 76.3% reduction in training time with the proposed parallelization scheme and a sim2real gap as low as 2.9% using the proposed deployment method.