Logical forms complement probability in understanding language model (and human) performance

Abstract

With the increasing interest in using large language models (LLMs) for planning in natural language, understanding their behaviors becomes an important research question. This work conducts a systematic investigation of LLMs' ability to perform logical reasoning in natural language. We introduce a controlled dataset of hypothetical and disjunctive syllogisms in propositional and modal logic and use it as the testbed for understanding LLM performance. Our results lead to novel insights in predicting LLM behaviors: in addition to the probability of input (Gonen et al., 2023; McCoy et al., 2024), logical forms should be considered as important factors. In addition, we show similarities and discrepancies between the logical reasoning performances of humans and LLMs by collecting and comparing behavioral data from both.
