Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2

Abstract

Inferring a person's goal from their behavior is an important problem in applications of AI (e.g. automated assistants, recommender systems). The workhorse model for this task is the rational actor model, which amounts to assuming that people have stable reward functions, discount the future exponentially, and construct optimal plans. Under the rational actor assumption, techniques such as inverse reinforcement learning (IRL) can be used to infer a person's goals from their actions. A competing model is the dual-system model. Here decisions are the result of an interplay between a fast, automatic, heuristic-based system 1 and a slower, deliberate, calculating system 2. We generalize the dual-system framework to the case of Markov decision problems and show how to compute optimal plans for dual-system agents. We show that dual-system agents exhibit behaviors that are incompatible with the rational actor assumption. We show that naive applications of rational-actor IRL to the behavior of dual-system agents can generate wrong inferences about the agents' goals and suggest interventions that actually reduce the agent's overall utility. Finally, we adapt a simple IRL algorithm to correctly infer the goals of dual-system decision-makers. This allows us to make interventions that help, rather than hinder, the dual-system agent's ability to reach their true goals.
