Unsupervised Video Object Segmentation for Deep Reinforcement Learning

Abstract

We present a new technique for deep reinforcement learning that automatically detects moving objects and uses the relevant information for action selection. The detection of moving objects is done in an unsupervised way by exploiting structure from motion. Instead of directly learning a policy from raw images, the agent first learns to detect and segment moving objects by exploiting flow information in video sequences. The learned representation is then used to focus the policy of the agent on the moving objects. Over time, the agent identifies which objects are critical for decision making and gradually builds a policy based on relevant moving objects. This approach, which we call Motion-Oriented REinforcement Learning (MOREL), is demonstrated on a suite of Atari games where the ability to detect moving objects reduces the amount of interaction needed with the environment to obtain a good policy. Furthermore, the resulting policy is more interpretable than policies that directly map images to actions or values with a black box neural network. We can gain insight into the policy by inspecting the segmentation and motion of each object detected by the agent. This allows practitioners to confirm whether a policy is making decisions based on sensible information.
