Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement

Abstract

Recent advancements in large reasoning models (LRMs) have significantly enhanced language models' capabilities in complex problem-solving by emulating human-like deliberative thinking. However, these models often exhibit overthinking (i.e., the generation of unnecessarily verbose and redundant content), which hinders efficiency and inflates inference cost. In this work, we explore the representational and behavioral origins of this inefficiency, revealing that LRMs inherently possess the capacity for more concise reasoning. Empirical analyses show that correct reasoning paths vary significantly in length, and the shortest correct responses often suffice, indicating untapped efficiency potential. Exploiting these findings, we propose two lightweight methods to enhance LRM efficiency. First, we introduce Efficiency Steering, a training-free activation steering technique that modulates reasoning behavior via a single direction in the model's representation space. Second, we develop Self-Rewarded Efficiency RL, a reinforcement learning framework that dynamically balances task accuracy and brevity by rewarding concise correct solutions. Extensive experiments on seven LRM backbones across multiple mathematical reasoning benchmarks demonstrate that our methods significantly reduce reasoning length while preserving or improving task performance. Our results highlight that reasoning efficiency can be improved by leveraging and guiding the intrinsic capabilities of existing models in a self-guided manner.
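The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the steering direction is derived or how the reward is shaped, so everything below is an assumption made for illustration: the direction is taken as a normalized difference of mean activations between concise and verbose responses (a common activation-steering recipe, not necessarily the authors'), and the reward gives a bonus to correct answers that are shorter than a reference length. All function names and parameters (`steering_direction`, `steer`, `efficiency_reward`, `alpha`, `weight`) are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(concise_acts: Sequence[Sequence[float]],
                       verbose_acts: Sequence[Sequence[float]]) -> Vector:
    """ASSUMPTION: unit-norm difference of means (concise minus verbose)."""
    diff = [c - v for c, v in zip(mean_vector(concise_acts),
                                  mean_vector(verbose_acts))]
    norm = sum(x * x for x in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("concise and verbose means coincide; no direction")
    return [x / norm for x in diff]


def steer(hidden: Sequence[float], direction: Sequence[float],
          alpha: float) -> Vector:
    """Shift one hidden state along the direction by strength alpha."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def efficiency_reward(is_correct: bool, length: int,
                      reference_length: int, weight: float = 0.5) -> float:
    """ASSUMPTION: incorrect answers get 0; correct answers get 1 plus a
    bonus proportional to the fraction of tokens saved vs. a reference."""
    if not is_correct:
        return 0.0
    saving = max(0.0, 1.0 - length / reference_length)
    return 1.0 + weight * saving
```

In a real model the shift in `steer` would be applied to a chosen layer's hidden states during generation, and the reference length in `efficiency_reward` might come from the model's own sampled responses; both choices are left open here because the abstract does not specify them.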
