Can Agents Learn by Analogy? An Inferable Model for PAC Reinforcement Learning

Abstract

Model-based reinforcement learning algorithms make decisions by building and utilizing a model of the environment. However, no existing algorithm attempts to infer the dynamics of a state-action pair from known state-action pairs before that pair has been visited sufficiently many times. We propose a new model-based method called Greedy Inference Model (GIM) that infers the unknown dynamics from known dynamics based on the internal spectral properties of the environment. In other words, GIM can "learn by analogy". We further introduce a new exploration strategy which ensures that the agent rapidly and evenly visits unknown state-action pairs. GIM is much more computationally efficient than state-of-the-art model-based algorithms, as the number of dynamic programming operations is independent of the environment size. Under mild conditions, GIM can also achieve lower sample complexity than methods that do not perform inference. Experimental results demonstrate the effectiveness and efficiency of GIM in a variety of real-world tasks.
