Scaling Scaling Laws with Board Games

Abstract

The largest experiments in machine learning now require resources far beyond the budget of all but a few institutions. Fortunately, it has recently been shown that the results of these huge experiments can often be extrapolated from the results of a sequence of far smaller, cheaper experiments. In this work, we show that the extrapolation can be based not only on the size of the model, but also on the size of the problem. By conducting a sequence of experiments using AlphaZero and Hex, we show that the performance achievable with a fixed amount of compute degrades predictably as the game gets larger and harder. Along with our main result, we further show that increasing the test-time compute available to an agent can substitute for reduced train-time compute, and vice versa.
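To illustrate what extrapolating from a sequence of smaller experiments can look like in practice, the following is a minimal sketch that fits a trend to a few cheap runs and evaluates it at a much larger budget. Everything in it is an assumption made for illustration: the power-law form, the function names `fit_power_law` and `predict`, and the synthetic numbers are placeholders, not the functional form or the results of this work.

```python
import numpy as np


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return float(np.exp(intercept)), float(slope)


def predict(a, b, x):
    """Evaluate the fitted power law at x."""
    return a * np.asarray(x, dtype=float) ** b


# Synthetic, noise-free placeholder data (NOT measurements from the paper):
# a hypothetical error metric observed at four small compute budgets.
small_compute = np.array([1e12, 1e13, 1e14, 1e15])
small_error = 50.0 * small_compute ** -0.1

# Fit on the cheap runs, then extrapolate to a far larger budget.
a, b = fit_power_law(small_compute, small_error)
extrapolated = float(predict(a, b, 1e18))
```

Fitting in log-log space turns the power law into a straight line, so an ordinary linear fit suffices. With real, noisy measurements, the extrapolated value would carry uncertainty that this sketch does not estimate.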
