Abstract
The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine, which often face challenging tasks of decision-making under uncertainty. In this paper, we show that directly prompting LLMs on these types of decision-making problems can yield poor results, especially as the problem complexity increases. To aid in these tasks, we propose DeLLMa (Decision-making Large Language Model assistant), a framework designed to enhance decision-making accuracy in uncertain environments. DeLLMa involves a multi-step reasoning procedure that integrates recent best practices in scaling inference-time reasoning, drawing upon principles from decision theory and utility theory, to provide an accurate and human-auditable decision-making process. We validate our procedure on multiple realistic decision-making environments, demonstrating that DeLLMa can consistently enhance the decision-making performance of leading language models, and achieve up to a 40% increase in accuracy over competing methods. Additionally, we show how performance improves when scaling compute at test time, and carry out human evaluations to benchmark components of DeLLMa.