Abstract
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/ysymyth/tree-of-thought-llm.
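
As a rough illustration only, the idea of exploring several reasoning paths and scoring them before committing can be sketched as a breadth-first search with a fixed beam, applied to Game of 24. In ToT a language model both proposes and evaluates thoughts; the sketch below swaps in hypothetical deterministic stand-ins (`propose`, `evaluate`) so it runs without a model. All function names, the beam width, and the scoring heuristic are assumptions made for this example and are not taken from the authors' code.

```python
from fractions import Fraction
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[Fraction, ...]  # numbers still available in a partial solution
TARGET = Fraction(24)


def propose(state: State) -> List[State]:
    """Hypothetical stand-in for LM thought generation: combine any two numbers."""
    children: List[State] = []
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [x for k, x in enumerate(state) if k not in (i, j)]
        results = {a + b, a - b, b - a, a * b}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        children.extend(tuple(sorted(rest + [r])) for r in results)
    return children


def evaluate(state: State) -> float:
    """Hypothetical stand-in for LM self-evaluation: higher means more promising."""
    if len(state) == 1:
        return 1.0 if state[0] == TARGET else 0.0
    # Crude assumption: a state holding a number near the target looks promising.
    return 1.0 / (1.0 + min(abs(float(x - TARGET)) for x in state))


def is_goal(state: State) -> bool:
    return len(state) == 1 and state[0] == TARGET


def tot_bfs(
    root: State,
    propose: Callable[[State], List[State]],
    evaluate: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 5,
    max_depth: int = 3,
) -> Optional[State]:
    """Breadth-first search over thoughts, keeping the best `beam_width` per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every kept state and drop duplicates while preserving order.
        candidates = list(dict.fromkeys(c for s in frontier for c in propose(s)))
        for c in candidates:
            if is_goal(c):
                return c
        # Keep only the highest-scoring candidates for the next level.
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:beam_width]
        if not frontier:
            break
    return None


def solve_24(numbers: Sequence[int], beam_width: int = 5) -> Optional[State]:
    root = tuple(sorted(Fraction(n) for n in numbers))
    return tot_bfs(root, propose, evaluate, is_goal, beam_width, max_depth=len(root) - 1)


if __name__ == "__main__":
    # A very large beam makes the search exhaustive; a small one may prune a solution.
    print(solve_24((4, 4, 6, 8), beam_width=1000))
```

This sketch returns only the final state rather than the sequence of thoughts that led to it, and it covers only breadth-first exploration; lookahead and backtracking would need a different search loop, such as depth-first search with pruning.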