Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies

  • 2025-01-06 17:07:44
  • Dennis Gross, Helge Spieker

Abstract

Deep reinforcement learning (RL) policies can demonstrate unsafe behaviors and are challenging to interpret. To address these challenges, we combine RL policy model checking (a technique for determining whether RL policies exhibit unsafe behaviors) with co-activation graph analysis (a method that maps neural network inner workings by analyzing neuron activation patterns) to gain insight into the safe RL policy's sequential decision-making. This combination lets us interpret the RL policy's inner workings for safe decision-making. We demonstrate its applicability in various experiments.

 
