Abstract
To equip robots with dexterous skills, an effective approach is to first transfer the desired skill via Learning from Demonstration (LfD), then let the robot improve it by self-exploration via Reinforcement Learning (RL). In this paper, we propose a novel LfD+RL framework, namely Adaptive Conditional Neural Movement Primitives (ACNMP), that allows efficient policy improvement in novel environments and effective skill transfer between different agents. This is achieved by exploiting the latent representation learned by the underlying Conditional Neural Process (CNP) model, and by simultaneously training the model with supervised learning (SL) to acquire the demonstrated trajectories and with RL to discover new trajectories. Through simulation experiments, we show that (i) ACNMP enables the system to extrapolate to situations where pure LfD fails; (ii) simultaneous training of the system through SL and RL preserves the shape of the demonstrations while adapting to novel situations, owing to the shared representations used by both learners; (iii) ACNMP enables order-of-magnitude more sample-efficient RL in extrapolation of reaching tasks compared to existing approaches; (iv) ACNMPs can be used to implement skill transfer between robots with different morphologies, with competitive learning speeds and, importantly, with fewer assumptions than state-of-the-art approaches. Finally, we show the real-world suitability of ACNMPs through real robot experiments that involve obstacle avoidance, pick-and-place, and pouring actions.