Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning

Abstract

Deep reinforcement learning (DRL) is a promising approach to solve complex control tasks by learning policies through interactions with the environment. However, the training of DRL policies requires large amounts of training experiences, making it impractical to learn the policy directly on physical systems. Sim-to-real approaches leverage simulations to pretrain DRL policies and then deploy them in the real world. Unfortunately, the direct real-world deployment of pretrained policies usually suffers from performance deterioration due to the different dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of the pretrained agents. Nevertheless, the simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging due to the high cost of real-world samples. This work proposes a distributed cloud-edge architecture to train DRL agents in the real world in real-time. In the architecture, the inference and training are assigned to the edge and cloud, separating the real-time control loop from the computationally expensive training loop. To overcome the reality gap, our architecture exploits sim-to-real transfer strategies to continue the training of simulation-pretrained agents on a physical system. We demonstrate its applicability on a physical inverted-pendulum control system, analyzing critical parameters. The real-world experiments show that our architecture can adapt the pretrained DRL agents to unseen dynamics consistently and efficiently.
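
The separation of the real-time control loop from the training loop can be illustrated with a minimal sketch. It is not the implementation used in this work: the names CloudTrainer, edge_control_loop, and run_demo, the scalar "gain" standing in for policy weights, the toy linear plant, and the in-process queues standing in for the network link are all assumptions made for illustration. The edge loop only runs inference and never blocks on training; the cloud thread consumes transitions in batches and publishes new weights whenever an update finishes.

```python
import queue
import threading


class CloudTrainer(threading.Thread):
    """Hypothetical cloud-side learner: consumes transitions, publishes weights."""

    def __init__(self, experience_q, weight_q, batch_size=4):
        super().__init__(daemon=True)
        self.experience_q = experience_q
        self.weight_q = weight_q
        self.batch_size = batch_size
        self.version = 0          # number of completed updates
        self.transitions_seen = 0
        self.gain = 0.0           # stand-in for the policy parameters

    def run(self):
        batch = []
        while True:
            item = self.experience_q.get()   # blocking is fine on the cloud side
            if item is None:                 # sentinel: the edge has finished
                break
            self.transitions_seen += 1
            batch.append(item)
            if len(batch) == self.batch_size:
                self.gain += 0.1             # placeholder for a real gradient step
                self.version += 1
                self.weight_q.put((self.version, self.gain))
                batch = []


def edge_control_loop(experience_q, weight_q, steps):
    """Hypothetical edge-side loop: inference only, never waits for training."""
    version, gain = 0, 0.0    # in practice: weights of the simulation-pretrained policy
    state = 1.0
    for _ in range(steps):
        try:
            while True:       # drain the queue and keep only the newest weights
                version, gain = weight_q.get_nowait()
        except queue.Empty:
            pass
        action = -gain * state                    # policy inference
        next_state = 0.9 * state + 0.1 * action   # toy plant, assumed dynamics
        reward = -next_state ** 2
        experience_q.put((state, action, reward, next_state))
        state = next_state
    experience_q.put(None)    # tell the trainer to stop
    return version


def run_demo(steps=8, batch_size=4):
    experience_q, weight_q = queue.Queue(), queue.Queue()
    trainer = CloudTrainer(experience_q, weight_q, batch_size)
    trainer.start()
    edge_control_loop(experience_q, weight_q, steps)
    trainer.join()
    return trainer.version, trainer.transitions_seen
```

Calling run_demo(8, 4) sends eight transitions to the trainer, which performs two batched updates. Which weight version the edge happens to be using at a given control step depends on thread timing, which is the point of the decoupling: the control loop keeps its rate regardless of how long an update takes.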
