Reverse Training to Nurse the Reversal Curse

Abstract

Large language models (LLMs) have a surprising failure: when trained on "A has a feature B", they do not generalize to "B is a feature of A", which is termed the Reversal Curse. Even when training with trillions of tokens, this issue still appears due to Zipf's law, and hence would persist even if we trained on the entire internet. This work proposes an alternative training scheme, called reverse training, whereby all words are used twice, doubling the amount of available tokens. The LLM is trained in both forward and reverse directions by reversing the training strings while preserving (i.e., not reversing) chosen substrings, such as entities. We show that data-matched reverse-trained models provide superior performance to standard models on standard tasks, and compute-matched reverse-trained models provide far superior performance on reversal tasks, helping resolve the reversal curse issue.
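
The reversal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `reverse_preserving` and `build_training_set` are hypothetical, whitespace splitting stands in for a real tokenizer, and the list of substrings to preserve is assumed to be supplied by the caller rather than detected automatically.

```python
def reverse_preserving(text, preserved=()):
    """Reverse the word order of `text`, keeping each phrase in `preserved` intact.

    Assumption: words are whitespace-separated; a real pipeline would use
    the model's tokenizer and its own way of choosing preserved substrings.
    """
    words = text.split()
    # Try longer phrases first so overlapping candidates resolve to the longest match.
    phrases = sorted((p.split() for p in preserved), key=len, reverse=True)
    segments = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            n = len(phrase)
            if n and words[i:i + n] == phrase:
                # Keep the whole phrase as one unit, in its original order.
                segments.append(" ".join(phrase))
                i += n
                break
        else:
            # No preserved phrase starts here: treat the word as its own unit.
            segments.append(words[i])
            i += 1
    return " ".join(reversed(segments))


def build_training_set(examples, preserved=()):
    """Return every example in forward form followed by its reversed form."""
    out = []
    for example in examples:
        out.append(example)
        out.append(reverse_preserving(example, preserved))
    return out


print(reverse_preserving("Marie Curie discovered radium", ["Marie Curie"]))
# prints: radium discovered Marie Curie
```

Each example appears once forward and once reversed, so the resulting set is twice the size of the input, which mirrors the abstract's point that all words are used twice. Without the preserved list, the same call would yield "radium discovered Curie Marie", splitting the entity name.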
