Self-Consistent Model-based Adaptation for Visual Reinforcement Learning

Abstract

Visual reinforcement learning agents typically face serious performance declines in real-world applications caused by visual distractions. Existing methods rely on fine-tuning the policy's representations with hand-crafted augmentations. In this work, we propose Self-Consistent Model-based Adaptation (SCMA), a novel method that fosters robust adaptation without modifying the policy. By transferring cluttered observations to clean ones with a denoising model, SCMA can mitigate distractions for various policies as a plug-and-play enhancement. To optimize the denoising model in an unsupervised manner, we derive an unsupervised distribution matching objective with a theoretical analysis of its optimality. We further present a practical algorithm to optimize the objective by estimating the distribution of clean observations with a pre-trained world model. Extensive experiments on multiple visual generalization benchmarks and real robot data demonstrate that SCMA effectively boosts performance across various distractions and exhibits better sample efficiency.
