Abstract
Deep reinforcement learning (DRL) has recently emerged as a promising approach for robotic control. However, the deployment of DRL in real-world robots is hindered by its sensitivity to environmental perturbations. Existing white-box adversarial attacks, which are used to evaluate DRL robustness, rely on local gradient information and apply uniform perturbations across all states, and therefore fail to account for temporal dynamics and state-specific vulnerabilities. To address these limitations, we first conduct a theoretical analysis of white-box attacks in DRL by establishing the adversarial victim-dynamics Markov decision process (AVD-MDP), from which we derive the necessary and sufficient conditions for a successful attack. Based on this analysis, we propose a selective state-aware reinforcement adversarial attack method, named STAR, that optimizes perturbation stealthiness and state-visitation dispersion. STAR first employs a soft mask-based state-targeting mechanism to minimize redundant perturbations, enhancing both stealthiness and attack effectiveness. It then incorporates an information-theoretic optimization objective that maximizes the mutual information between perturbations, environmental states, and victim actions, ensuring a dispersed state-visitation distribution that steers the victim agent into vulnerable states for maximum return reduction. Extensive experiments demonstrate that STAR outperforms state-of-the-art benchmarks.
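To make the idea of soft mask-based state targeting concrete, the following is a minimal illustrative sketch and not the STAR implementation. It assumes a sigmoid gate over hypothetical per-state vulnerability scores and a tanh-bounded perturbation with budget `epsilon`; the function names, the scoring scheme, and all parameter values are assumptions introduced here for illustration only.

```python
import numpy as np

def soft_mask(scores, temperature=1.0):
    """Map per-state scores to gates in (0, 1) with a sigmoid (an assumed gating form)."""
    scores = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-scores / temperature))

def masked_perturbation(states, raw_delta, scores, epsilon=0.1):
    """Apply a gated, bounded perturbation to each state.

    states:    (T, D) array of observations
    raw_delta: (T, D) array of unconstrained perturbation directions
    scores:    (T,) array of hypothetical per-state vulnerability scores
    """
    gate = soft_mask(scores)[:, None]      # shape (T, 1), broadcast over state dimensions
    delta = epsilon * np.tanh(raw_delta)   # each entry lies in [-epsilon, epsilon]
    return states + gate * delta, gate

# Toy data: five 3-dimensional states, two of which are scored as vulnerable.
rng = np.random.default_rng(0)
states = rng.normal(size=(5, 3))
raw_delta = rng.normal(size=(5, 3))
scores = np.array([-8.0, -8.0, 6.0, -8.0, 6.0])

perturbed, gate = masked_perturbation(states, raw_delta, scores)
```

In this sketch, states with low scores receive a gate close to zero and are left almost untouched, while the perturbation budget is concentrated on the states scored as vulnerable. This is the sense in which a soft mask can reduce redundant perturbations compared with perturbing every state uniformly.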