Abstract
Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7-DOF robot arm learning from user demonstrations.
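To make the starting point concrete, the following is a minimal sketch of the standard Boltzmann choice rule over a finite set of options, not of the model proposed here. The function name `boltzmann_probs`, the rationality coefficient `beta`, and the reward values are assumptions chosen for illustration. It shows the property that motivates rethinking the model: because each option contributes independently, adding a near-duplicate of one option shifts probability mass toward that group of similar options.

```python
import numpy as np


def boltzmann_probs(rewards, beta=1.0):
    """Standard Boltzmann rule: P(i) is proportional to exp(beta * reward_i).

    `beta` is a hypothetical rationality coefficient used for illustration.
    """
    z = beta * np.asarray(rewards, dtype=float)
    z -= z.max()  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


# Two distinct options A and B with equal reward: each gets half the mass.
p_two = boltzmann_probs([1.0, 1.0])

# Options A, A' (a near-duplicate of A), and B, all with equal reward.
# A and A' are treated as independent options, so together they take 2/3.
p_three = boltzmann_probs([1.0, 1.0, 1.0])
mass_on_a_group = p_three[0] + p_three[1]

print(p_two, mass_on_a_group)
```

In a continuous trajectory space, many trajectories are nearly identical, so this independence assumption can inflate the probability of densely populated regions regardless of reward.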