Variational Meta Reinforcement Learning for Social Robotics

Abstract

With the increasing presence of robots in our everyday environments, improving their social skills is of utmost importance. Nonetheless, social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, as social norms depend strongly on the environment. For example, a robot should navigate more carefully around patients in a hospital than around workers in an office. In this work, we investigate meta-reinforcement learning (meta-RL) as a potential solution. Here, robot behaviors are learned via reinforcement learning, where a reward function needs to be chosen so that the robot learns an appropriate behavior for a given environment. We propose to use a variational meta-RL procedure that quickly adapts the robots' behavior to new reward functions. As a result, given a new environment, different reward functions can be quickly evaluated and an appropriate one selected. The procedure learns a vectorized representation for reward functions and a meta-policy that can be conditioned on such a representation. Given observations from a new reward function, the procedure identifies its representation and conditions the meta-policy on it. While investigating the procedure's capabilities, we realized that it suffers from posterior collapse, where only a subset of the dimensions in the representation encode useful information, resulting in reduced performance. Our second contribution, a radial basis function (RBF) layer, partially mitigates this negative effect. The RBF layer lifts the representation to a higher-dimensional space, which is more easily exploitable by the meta-policy. We demonstrate the value of the RBF layer and the usage of meta-RL for social robotics on four robotic simulation tasks.
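
To make the idea of lifting a representation with radial basis functions more concrete, the following is a minimal sketch and not the layer used in the paper. It assumes Gaussian basis functions, a fixed set of centers shared across all latent dimensions, and a single bandwidth parameter; the names `rbf_lift`, `centers`, and `gamma` are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_lift(z, centers, gamma=1.0):
    """Expand each latent dimension with Gaussian radial basis functions.

    z:       array of shape (d,), a latent task representation.
    centers: array of shape (k,), RBF centers shared by all dimensions
             (an assumption made for this sketch).
    gamma:   bandwidth of the Gaussians (assumed fixed, not learned).
    Returns an array of shape (d * k,).
    """
    z = np.asarray(z, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Differences between every latent value and every center: shape (d, k).
    diff = z[:, None] - centers[None, :]
    # Gaussian response of each center, flattened into one feature vector.
    return np.exp(-gamma * diff ** 2).reshape(-1)

z = np.array([0.0, 0.5, -1.0])        # 3-dimensional latent vector
centers = np.linspace(-1.0, 1.0, 5)   # 5 evenly spaced centers
features = rbf_lift(z, centers)
print(features.shape)  # (15,)
```

In this sketch a 3-dimensional representation becomes a 15-dimensional one, and each input dimension activates the basis functions whose centers lie close to its value. A policy network conditioned on `features` instead of `z` would then receive a localized encoding of every dimension, which is one plausible reading of why such a lifted space could be easier to exploit.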
