Learning to Reason via Mixture-of-Thought for Logical Reasoning

Abstract

Human beings naturally utilize multiple reasoning modalities, i.e., different representational formats such as natural language, code, and symbolic logic, to learn and solve logical problems. In contrast, most existing LLM-based approaches operate with a single reasoning modality during training, typically natural language. Although some methods have explored modality selection or augmentation at inference time, the training process remains modality-blind, limiting synergy among modalities. To fill this gap, we propose Mixture-of-Thought (MoT), a framework that enables LLMs to reason across three complementary modalities: natural language, code, and a newly introduced symbolic modality, truth-table, which systematically enumerates logical cases and partially mitigates key failure modes in natural language reasoning. MoT adopts a two-phase design: (1) self-evolving MoT training, which jointly learns from filtered, self-generated rationales across modalities; and (2) MoT inference, which fully leverages the synergy of the three modalities to produce better predictions. Experiments on logical reasoning benchmarks including FOLIO and ProofWriter demonstrate that our MoT framework consistently and significantly outperforms strong LLM baselines with single-modality chain-of-thought approaches, achieving up to +11.7pp average accuracy gain. Further analyses show that our MoT framework benefits both training and inference stages; that it is particularly effective on harder logical reasoning problems; and that different modalities contribute complementary strengths, with truth-table reasoning helping to overcome key bottlenecks in natural language inference.
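
To make the idea of "systematically enumerating logical cases" concrete, the following is a minimal sketch of truth-table reasoning over propositional variables. It is an illustration only, not the paper's implementation: the function name `classify`, the representation of premises as Python callables, and the label set (`"True"`, `"False"`, `"Unknown"`, `"Inconsistent"`) are all assumptions made for this example.

```python
from itertools import product

def classify(premises, conclusion, variables):
    """Hypothetical helper: label a conclusion by enumerating every truth assignment.

    premises   -- list of callables mapping an assignment dict to bool
    conclusion -- callable mapping an assignment dict to bool
    variables  -- list of variable names to enumerate over
    """
    holds_somewhere = False
    fails_somewhere = False
    # Enumerate all 2^n rows of the truth table.
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        # Keep only the rows in which every premise is satisfied.
        if not all(p(world) for p in premises):
            continue
        if conclusion(world):
            holds_somewhere = True
        else:
            fails_somewhere = True
    if not (holds_somewhere or fails_somewhere):
        return "Inconsistent"  # no row satisfies the premises
    if holds_somewhere and not fails_somewhere:
        return "True"          # conclusion holds in every consistent row
    if fails_somewhere and not holds_somewhere:
        return "False"         # conclusion fails in every consistent row
    return "Unknown"           # premises do not settle the conclusion

# Toy example (assumed for illustration): "if it rains, the ground is wet".
variables = ["rain", "wet"]
implication = lambda w: (not w["rain"]) or w["wet"]
it_rains = lambda w: w["rain"]

label_entailed = classify([implication, it_rains], lambda w: w["wet"], variables)
label_refuted = classify([implication, it_rains], lambda w: not w["wet"], variables)
label_open = classify([implication], lambda w: w["rain"], variables)
```

Here `label_entailed` covers the case where the premises force the conclusion, `label_refuted` the case where they rule it out, and `label_open` the case where surviving rows disagree. Exhaustive enumeration of this kind is exponential in the number of variables, so it is only a sketch of the underlying idea.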
