AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Abstract

Reasoning-capable large language models (LLMs) demonstrate strong performance on complex reasoning tasks but often suffer from overthinking, generating unnecessarily long chain-of-thought (CoT) reasoning paths for easy reasoning questions, thereby increasing inference cost and latency. Recent approaches attempt to address this challenge by manually deciding when to apply long or short reasoning. However, they lack the flexibility to adapt CoT length dynamically based on question complexity. In this paper, we propose Auto Long-Short Reasoning (AutoL2S), a dynamic and model-agnostic framework that enables LLMs to compress their generated reasoning path based on the complexity of the reasoning question. AutoL2S enables a learned paradigm in which LLMs themselves can decide when longer reasoning is necessary and when shorter reasoning suffices, by training on data annotated with our proposed method, which includes both long and short CoT paths and a special <EASY> token. We then use the <EASY> token to indicate when the model can skip generating lengthy CoT reasoning. This proposed annotation strategy can enhance the LLMs' ability to generate shorter CoT reasoning paths with improved quality after training. Extensive evaluation results show that AutoL2S reduces the length of reasoning generation by up to 57% without compromising performance, demonstrating the effectiveness of AutoL2S for scalable and efficient LLM reasoning.
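
The annotation idea described in the abstract can be illustrated with a minimal sketch. The abstract states only that the training data contains long and short CoT paths plus a special <EASY> token, so the details here are assumptions made for illustration and not the paper's implementation: the `AnnotatedQuestion` record, the `is_easy` label, the target layout (the <EASY> token followed by the short path), and the helper names are all hypothetical.

```python
from dataclasses import dataclass

EASY_TOKEN = "<EASY>"  # special token named in the abstract


@dataclass
class AnnotatedQuestion:
    """Hypothetical record holding both reasoning paths for one question."""
    question: str
    long_cot: str
    short_cot: str
    is_easy: bool  # assumed label; the abstract does not say how it is derived


def build_target(ex: AnnotatedQuestion) -> str:
    """Assumed target layout: easy questions start with <EASY> and use the short path."""
    if ex.is_easy:
        return f"{EASY_TOKEN} {ex.short_cot}"
    return ex.long_cot


def took_short_path(generated: str) -> bool:
    """Check whether a generation opened with the <EASY> marker."""
    return generated.lstrip().startswith(EASY_TOKEN)


def length_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Fractional reduction in generated length relative to a baseline."""
    return 1.0 - new_tokens / baseline_tokens
```

Under these assumptions, a generation that shrinks from 1000 to 430 tokens gives `length_reduction(1000, 430)` of about 0.57, which is how a figure such as "up to 57%" would be read; the token counts are made up for the arithmetic and are not results from the paper.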
