Avoiding Catastrophe in Online Learning by Asking for Help

  • 2024-10-04 16:23:43
  • Benjamin Plaut, Hanlin Zhu, Stuart Russell
  • 0

Abstract

Most learning algorithms with formal regret guarantees assume that no mistakeis irreparable and essentially rely on trying all possible behaviors. Thisapproach is problematic when some mistakes are \emph{catastrophic}, i.e.,irreparable. We propose an online learning problem where the goal is tominimize the chance of catastrophe. Specifically, we assume that the payoff ineach round represents the chance of avoiding catastrophe that round and aim tomaximize the product of payoffs (the overall chance of avoiding catastrophe)while allowing a limited number of queries to a mentor. We first show that ingeneral, any algorithm either constantly queries the mentor or is nearlyguaranteed to cause catastrophe. However, in settings where the mentor policyclass is learnable in the standard online learning model, we provide analgorithm whose regret and rate of querying the mentor both approach 0 as thetime horizon grows. Conceptually, if a policy class is learnable in the absenceof catastrophic risk, it is learnable in the presence of catastrophic risk ifthe agent can ask for help.

 

Quick Read (beta)

loading the full paper ...