Learning to reinforcement learn for Neural Architecture Search

Abstract

Reinforcement learning (RL) is a goal-oriented learning solution that has proven to be successful for Neural Architecture Search (NAS) on the CIFAR and ImageNet datasets. However, a limitation of this approach is its high computational cost, making it unfeasible to replay it on other datasets. Through meta-learning, we could bring this cost down by adapting previously learned policies instead of learning them from scratch. In this work, we propose a deep meta-RL algorithm that learns an adaptive policy over a set of environments, making it possible to transfer it to previously unseen tasks. The algorithm was applied to various proof-of-concept environments in the past, but we adapt it to the NAS problem. We empirically investigate the agent's behavior during training when challenged to design chain-structured neural architectures for three datasets with increasing levels of hardness, to later fix the policy and evaluate it on two unseen datasets of different difficulty. Our results show that, under resource constraints, the agent effectively adapts its strategy during training to design better architectures than the ones designed by a standard RL algorithm, and can design good architectures during the evaluation on previously unseen environments. We also provide guidelines on the applicability of our framework in a more complex NAS setting by studying the progress of the agent when challenged to design multi-branch architectures.
