Using Logical Specifications of Objectives in Multi-Objective Reinforcement Learning

Abstract

It is notoriously difficult to control the behavior of reinforcement learning agents. Agents often learn to exploit the environment or reward signal and need to be retrained multiple times. The multi-objective reinforcement learning (MORL) framework separates a reward function into several objectives. An ideal MORL agent learns to generalize to novel combinations of objectives, allowing for better control of an agent's behavior without requiring retraining. Many MORL approaches use a weight vector to parameterize the importance of each objective. However, this approach suffers from a lack of expressiveness and interpretability. We propose using propositional logic to specify the importance of multiple objectives. By using a logic where predicates correspond directly to objectives, specifications are inherently more interpretable. Additionally, the set of specifications that can be expressed with formal languages is a superset of what can be expressed by weight vectors. In this paper, we define a formal language based on propositional logic with quantitative semantics. We encode logical specifications using a recurrent neural network and show that MORL agents parameterized by these encodings are able to generalize to novel specifications over objectives and achieve performance comparable to single-objective baselines.
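To make the idea of a propositional specification with quantitative semantics concrete, the following is a minimal sketch rather than the language defined in the paper. The abstract does not state the semantics, so the choices here are assumptions: objective values are taken to lie in [0, 1], conjunction is scored as a minimum, disjunction as a maximum, and negation as 1 - x (a common fuzzy-logic-style convention). The objective names and the classes `Objective`, `Not`, `And`, `Or`, and the function `evaluate` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Objective:
    """Atomic predicate that corresponds directly to one objective."""
    name: str


@dataclass(frozen=True)
class Not:
    arg: "Spec"


@dataclass(frozen=True)
class And:
    left: "Spec"
    right: "Spec"


@dataclass(frozen=True)
class Or:
    left: "Spec"
    right: "Spec"


Spec = Union[Objective, Not, And, Or]


def evaluate(spec: Spec, values: Dict[str, float]) -> float:
    """Score a specification against per-objective values in [0, 1].

    Assumed semantics (not taken from the paper):
    And -> min, Or -> max, Not -> 1 - x.
    """
    if isinstance(spec, Objective):
        return values[spec.name]
    if isinstance(spec, Not):
        return 1.0 - evaluate(spec.arg, values)
    if isinstance(spec, And):
        return min(evaluate(spec.left, values), evaluate(spec.right, values))
    if isinstance(spec, Or):
        return max(evaluate(spec.left, values), evaluate(spec.right, values))
    raise TypeError(f"unsupported specification node: {spec!r}")


# Hypothetical objective values, each normalized to [0, 1].
values = {"reach_goal": 0.75, "avoid_hazard": 0.5, "collect_item": 0.25}

# reach_goal AND (avoid_hazard OR NOT collect_item)
spec = And(
    Objective("reach_goal"),
    Or(Objective("avoid_hazard"), Not(Objective("collect_item"))),
)

score = evaluate(spec, values)
print(score)
```

Under these assumed semantics, a specification yields a single real-valued score that can be read directly off its formula, which is the kind of interpretability the abstract contrasts with weight vectors.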
