Non-myopic Generation of Language Model for Reasoning and Planning

  • 2024-10-23 08:02:09
  • Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong

Abstract

Large Language Models have demonstrated remarkable abilities in reasoning and planning by breaking down complex problems into sequential steps. Despite their success in various domains like mathematical problem-solving and coding, LLMs face challenges in ensuring reliable and optimal planning due to the inherently myopic nature of autoregressive decoding. This paper revisits LLM reasoning from an optimal-control perspective, proposing a novel method, Predictive-Decoding, that leverages Model Predictive Control to enhance planning accuracy. By re-weighting LLM distributions based on foresight trajectories, Predictive-Decoding aims to mitigate early errors and promote non-myopic planning. Our experiments show significant improvements in a wide range of tasks for math, coding, and agents. Furthermore, Predictive-Decoding demonstrates computational efficiency, outperforming search baselines with reduced computational resources. This study provides insights into optimizing LLM planning capabilities.

 
