Data Shifts Hurt CoT: A Theoretical Study

Abstract

Chain of Thought (CoT) has been applied to various large language models (LLMs) and has proven effective in improving the quality of their outputs. Recent studies prove that transformers have absolute upper bounds on their expressive power and consequently cannot solve many computationally difficult problems. Empowered by CoT, however, transformers are proven to solve some difficult problems effectively, such as the $k$-parity problem. Those works rely on two critical assumptions: (1) identical training and testing distributions, and (2) corruption-free training data with correct reasoning steps. In the real world, these assumptions do not always hold. Although the risks of data shifts have attracted attention, our work is, to the best of our knowledge, the first to rigorously study the exact harm caused by such shifts. Focusing on the $k$-parity problem, we investigate the joint impact of two types of data shifts, distribution shift and data poisoning, on the quality of trained models obtained by a well-established CoT decomposition. In addition to revealing the surprising phenomenon that CoT leads to worse performance on learning parity than directly generating the prediction, our technical results give a rigorous and comprehensive explanation of the mechanistic reasons for this impact.
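
For readers unfamiliar with the task, the following is a minimal Python sketch of the $k$-parity problem in its standard form: the label of a bit string is the XOR of the bits at $k$ secret positions. It contrasts directly producing the label with producing it through intermediate steps. The function names, the example input, and the sequential running-XOR decomposition are illustrative assumptions; the abstract does not specify which CoT decomposition the paper analyzes.

```python
from functools import reduce
from operator import xor


def parity_direct(x, secret_idx):
    """Return the k-parity label: XOR of the bits at the k secret positions."""
    return reduce(xor, (x[i] for i in secret_idx), 0)


def parity_cot(x, secret_idx):
    """Return the intermediate running XORs, one per secret position.

    Assumption: a simple sequential decomposition used only for illustration;
    the last entry equals the label computed by parity_direct.
    """
    steps = []
    acc = 0
    for i in secret_idx:
        acc ^= x[i]
        steps.append(acc)
    return steps


# Hypothetical example: d = 6 input bits, k = 3 secret positions.
x = [1, 0, 1, 1, 0, 1]
secret_idx = [0, 2, 5]

label = parity_direct(x, secret_idx)
steps = parity_cot(x, secret_idx)
```

In this sketch, corrupting any one intermediate step of a training example changes the final step as well, which gives some intuition for why poisoned reasoning steps could matter more than a single noisy label.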
