Machine Theory of Mind

Abstract

Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans' ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone. Through this process, it acquires a strong prior model for agents' behaviour, as well as the ability to bootstrap to richer predictions about agents' characteristics and mental states using only a small number of behavioural observations. We apply the ToMnet to agents behaving in simple gridworld environments, showing that it learns to model random, algorithmic, and deep reinforcement learning agents from varied populations, and that it passes classic ToM tasks such as the "Sally-Anne" test (Wimmer & Perner, 1983; Baron-Cohen et al., 1985) of recognising that others can hold false beliefs about the world. We argue that this system -- which autonomously learns how to model other agents in its world -- is an important step forward for developing multi-agent AI systems, for building intermediating technology for machine-human interaction, and for advancing the progress on interpretable AI.
