Show Your Work: Scratchpads for Intermediate Computation with Language Models

  • 2021-11-30 21:32:46
  • Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena
  • 173

Abstract

Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs. However, they struggle with tasks that require unbounded multi-step computation, such as adding integers or executing programs. Surprisingly, we find that these same models are able to perform complex multi-step computations -- even in the few-shot regime -- when asked to perform the operation "step by step", showing the results of intermediate computations. In particular, we train transformers to perform multi-step computations by asking them to emit intermediate computation steps into a "scratchpad". On a series of increasingly complex tasks ranging from long addition to the execution of arbitrary programs, we show that scratchpads dramatically improve the ability of language models to perform multi-step computations.
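
To make the scratchpad idea concrete for the long-addition task, here is a minimal sketch of how a training target with intermediate steps might be generated. The function name `addition_scratchpad` and the exact line format (`Input:`, `Scratchpad:`, `Target:`) are assumptions made for illustration; the abstract does not specify the format the authors use.

```python
def addition_scratchpad(a: int, b: int) -> str:
    """Build an illustrative scratchpad string for a + b (non-negative ints).

    The text layout is hypothetical, not the format used in the paper.
    """
    # Digits in least-significant-first order, padded to equal length.
    xs = [int(d) for d in str(a)][::-1]
    ys = [int(d) for d in str(b)][::-1]
    n = max(len(xs), len(ys))
    xs += [0] * (n - len(xs))
    ys += [0] * (n - len(ys))

    lines = [f"Input: {a} + {b}", "Scratchpad:"]
    carry = 0
    digits = []
    for i in range(n):
        total = xs[i] + ys[i] + carry
        digit, new_carry = total % 10, total // 10
        # One line per digit position: the intermediate result the model emits.
        lines.append(
            f"  {xs[i]} + {ys[i]} + carry {carry} = {total}"
            f" -> write {digit}, carry {new_carry}"
        )
        digits.append(digit)
        carry = new_carry
    if carry:
        digits.append(carry)

    result = "".join(str(d) for d in reversed(digits))
    lines.append(f"Target: {result}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(addition_scratchpad(29, 57))
```

In this sketch, a model trained on such strings would see the input line as its prompt and be asked to produce everything from `Scratchpad:` onward, so the final answer is conditioned on the digit-by-digit steps rather than produced in one pass.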

 
