Active Reinforcement Learning under Limited Visual Observability

Abstract

In this work, we investigate Active Reinforcement Learning (Active-RL), where an embodied agent simultaneously learns an action policy for the task while also controlling its visual observations in partially observable environments. We denote the former as the motor policy and the latter as the sensory policy. For example, humans solve real-world tasks by hand manipulation (motor policy) together with eye movements (sensory policy). Active-RL poses challenges in coordinating the two policies given their mutual influence. We propose SUGARL, Sensorimotor Understanding Guided Active Reinforcement Learning, a framework that models the motor and sensory policies separately but learns them jointly using an intrinsic sensorimotor reward. This learnable reward, assigned by a sensorimotor reward module, incentivizes the sensory policy to select observations that are optimal for inferring the agent's own motor action, inspired by the sensorimotor stage in humans. Through a series of experiments, we show the effectiveness of our method across a range of observability conditions and its adaptability to existing RL algorithms. The sensory policies learned through our method are observed to exhibit effective active vision strategies.
