Current image-based reinforcement learning (RL) algorithms typically operate on the whole image without performing object-level reasoning. This leads to inefficient goal sampling and ineffective reward functions. In this paper, we improve upon previous visual self-supervised RL by incorporating object-level reasoning and occlusion reasoning. Specifically, we use unknown object segmentation to ignore distractors in the scene for better reward computation and goal generation; we further enable occlusion reasoning by employing a novel auxiliary loss and training scheme. We demonstrate that our proposed algorithm, ROLL (Reinforcement learning with Object Level Learning), learns dramatically faster and achieves better final performance compared with previous methods in several simulated visual control tasks. Project video and code are available at https://sites.google.com/andrew.cmu.edu/roll.