Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction

Abstract

Reinforcement Learning (RL) agents have had great success in solving tasks with large observation and action spaces from limited feedback. Still, training the agents is data-intensive, and there are no guarantees that the learned behavior is safe and does not violate rules of the environment, which limits practical deployment in real-world scenarios. This paper discusses the engineering of reliable agents via the integration of deep RL with constraint-based augmentation models to guide the RL agent towards safe behavior. Within the set of constraints, the RL agent is free to adapt and explore, such that its effectiveness in solving the given problem is not hindered. However, once the RL agent leaves the space defined by the constraints, the outside models can provide guidance so that it still works reliably. We discuss integration points for constraint guidance within the RL process and perform experiments on two case studies: a strictly constrained card game and a grid world environment with additional combinatorial subgoals. Our results show that constraint guidance provides both improved reliability and safer behavior, as well as accelerated training.
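
To make the idea of guidance concrete, the following is a minimal sketch of one possible integration point: checking the agent's proposed action against the constraints and substituting an allowed one if it violates them. It is an illustration under stated assumptions, not the method evaluated in the paper. The names `guide_action`, `scores`, and `is_allowed` are hypothetical, and the constraint model is reduced to a simple predicate over discrete actions.

```python
from typing import Callable, Sequence, Tuple


def guide_action(
    scores: Sequence[float],
    is_allowed: Callable[[int], bool],
) -> Tuple[int, bool]:
    """Return (action, was_corrected) for a discrete action space.

    scores: the agent's preference for each action (hypothetical, e.g. Q-values).
    is_allowed: stand-in for the constraint model; True if the action is permitted.
    """
    # The agent's own greedy choice, made without looking at the constraints.
    proposed = max(range(len(scores)), key=lambda a: scores[a])
    if is_allowed(proposed):
        # Inside the constrained space the agent acts freely.
        return proposed, False

    # Outside the constrained space: fall back to the best-scoring allowed action.
    allowed = [a for a in range(len(scores)) if is_allowed(a)]
    if not allowed:
        raise ValueError("no action satisfies the constraints")
    return max(allowed, key=lambda a: scores[a]), True
```

For example, with `scores = [0.1, 0.7, 0.2]` and a constraint that forbids action 1, the call returns action 2 and flags the correction; with no forbidden actions it returns the agent's own choice unchanged. The correction flag could be fed back to the agent, for instance as a penalty signal, though that choice is also an assumption here.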
