Abstract
Autonomous systems are increasingly expected to operate in the presence of adversaries, though adversaries may infer sensitive information simply by observing a system. Therefore, we present a deceptive sequential decision-making framework that not only conceals sensitive information, but actively misleads adversaries about it. We model autonomous systems as Markov decision processes, with adversaries using inverse reinforcement learning to recover reward functions. To counter them, we present three regularization strategies for policy synthesis problems that actively deceive an adversary about a system's reward. ``Diversionary deception'' leads an adversary to draw any false conclusion about the system's reward function. ``Targeted deception'' leads an adversary to draw a specific false conclusion about the system's reward function. ``Equivocal deception'' leads an adversary to infer that the real reward and a false reward both explain the system's behavior. We show how each form of deception can be implemented in policy optimization problems and analytically bound the loss in total accumulated reward induced by deception. Next, we evaluate these developments in a multi-agent setting. We show that diversionary, targeted, and equivocal deception all steer the adversary to false beliefs while still attaining a total accumulated reward that is at least 97% of its optimal, non-deceptive value.