Gradient Agreement as an Optimization Objective for Meta-Learning

Abstract

This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model that lets an agent adapt rapidly when presented with previously unseen tasks. Tasks are sampled from a specific distribution, which is assumed to be similar for both seen and unseen tasks. We focus on a family of meta-learning methods that learn initial parameters of a base model which can be fine-tuned quickly on a new task with a few gradient steps (MAML). Our approach pushes the model parameters in a direction on which the tasks agree more. If the gradients of a task agree with the parameter update vector, then their inner product will be a large positive value. Accordingly, given a batch of tasks to optimize for, we assign a positive (negative) weight to the loss function of a task if the inner product between its gradients and the average of the gradients of all tasks in the batch is positive (negative). The weights on the task loss functions thus control how much each task contributes to the parameter update. Our method can be easily integrated with current meta-learning algorithms for neural networks. Our experiments demonstrate that it yields models with better generalization compared to MAML and Reptile.
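
The weighting rule described in the abstract can be illustrated with a minimal sketch. It assumes that the per-task gradients are already available as flattened vectors (for example, after the inner-loop adaptation steps of a MAML-style method). The function names `agreement_weights` and `weighted_update`, the normalization by the sum of absolute inner products, and the uniform fallback are assumptions made for illustration; the abstract specifies only that the sign of each weight follows the sign of the inner product with the batch-average gradient.

```python
def agreement_weights(task_grads):
    """Return one weight per task from gradient agreement.

    task_grads: list of equal-length lists of floats, one flattened
    gradient per task (hypothetical input format).
    """
    n_tasks = len(task_grads)
    dim = len(task_grads[0])

    # Average gradient over all tasks in the batch.
    avg = [sum(g[d] for g in task_grads) / n_tasks for d in range(dim)]

    # Inner product of each task gradient with the average gradient.
    scores = [sum(gi * ai for gi, ai in zip(g, avg)) for g in task_grads]

    # Assumption: scale so the absolute weights sum to 1.
    total = sum(abs(s) for s in scores)
    if total == 0.0:
        # Assumption: fall back to uniform weights when no signal exists.
        return [1.0 / n_tasks] * n_tasks
    return [s / total for s in scores]


def weighted_update(task_grads, weights):
    """Combine task gradients into one update direction using the weights."""
    dim = len(task_grads[0])
    return [sum(w * g[d] for w, g in zip(weights, task_grads)) for d in range(dim)]


if __name__ == "__main__":
    # Two tasks roughly agree; the third points the other way.
    grads = [[1.0, 0.0], [0.8, 0.6], [-1.0, 0.2]]
    w = agreement_weights(grads)
    print("weights:", w)
    print("update direction:", weighted_update(grads, w))
```

In this toy batch the third task receives a negative weight because its gradient opposes the batch average, so its contribution to the combined update is reversed rather than averaged in.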
