On Inductive Biases in Deep Reinforcement Learning

Abstract

Many deep reinforcement learning algorithms contain inductive biases that sculpt the agent's objective and its interface to the environment. These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters. In general, there is a trade-off between generality and performance when algorithms use such biases. Stronger biases can lead to faster learning, but weaker biases can potentially lead to more general algorithms. This trade-off is important because inductive biases are not free; substantial effort may be required to obtain relevant domain knowledge or to tune hyper-parameters effectively. In this paper, we re-examine several domain-specific components that bias the objective and the environmental interface of common deep reinforcement learning agents. We investigated whether the performance deteriorates when these components are replaced with adaptive solutions from the literature. In our experiments, performance sometimes decreased with the adaptive components, as one might expect when comparing to components crafted for the domain, but sometimes the adaptive components performed better. We investigated the main benefit of having fewer domain-specific components, by comparing the learning performance of the two systems on a different set of continuous control problems, without additional tuning of either system. As hypothesized, the system with adaptive components performed better on many of the new tasks.
