Meta-Learning with Implicit Gradients

Abstract

A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient (or optimization) based meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner-loop, by using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner loop optimizer. As a result, our approach is agnostic to the choice of inner loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints. Theoretically, we prove that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that which is required to compute a single inner loop gradient and at no overall increase in the total computational cost. Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks.
