DecompSR: A dataset for decomposed analyses of compositional multihop spatial reasoning

  • 2025-11-04 14:57:11
  • Lachlan McPheat, Navdeep Kaur, Robert Blackwell, Alessandra Russo, Anthony G. Cohn, Pranava Madhyastha

Abstract

We introduce DecompSR (decomposed spatial reasoning), a large benchmark dataset (over 5m datapoints) and generation framework designed to analyse compositional spatial reasoning ability. The generation of DecompSR allows users to independently vary several aspects of compositionality, namely: productivity (reasoning depth), substitutivity (entity and linguistic variability), overgeneralisation (input order, distractors) and systematicity (novel linguistic elements). DecompSR is built procedurally in a manner that makes it correct by construction, and its correctness is independently verified using a symbolic solver. DecompSR is comprehensively benchmarked across a host of Large Language Models (LLMs), where we show that LLMs struggle with productive and systematic generalisation in spatial reasoning tasks, whereas they are more robust to linguistic variation. DecompSR provides a provably correct and rigorous benchmarking dataset with a novel ability to independently vary the degrees of several key aspects of compositionality, allowing for robust and fine-grained probing of the compositional reasoning abilities of LLMs.
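To make "correct by construction" and "independently verified" concrete, the following is a minimal illustrative sketch, not the DecompSR generator itself. It assumes a toy grid world with four relations; the names `DIRECTIONS`, `generate`, and `solve`, the entity labels, and the offset-vector label format are all hypothetical. The generator accumulates the answer while it builds a chain of a chosen depth, and a separate solver recomputes the answer from the facts alone, regardless of their order.

```python
import random

# Hypothetical relation vocabulary: name -> (dx, dy) offset on a grid.
DIRECTIONS = {
    "left": (-1, 0), "right": (1, 0), "above": (0, 1), "below": (0, -1),
}

def generate(depth, seed=0, shuffle=False):
    """Build a chain of `depth` hops; the label is known by construction."""
    rng = random.Random(seed)
    entities = [f"E{i}" for i in range(depth + 1)]
    facts, x, y = [], 0, 0
    for head, tail in zip(entities[1:], entities):
        name = rng.choice(sorted(DIRECTIONS))
        dx, dy = DIRECTIONS[name]
        x, y = x + dx, y + dy             # running offset of `head` from E0
        facts.append((head, name, tail))  # reads: "head is <name> of tail"
    if shuffle:
        rng.shuffle(facts)                # vary input order only
    query = (entities[-1], entities[0])   # where is the last entity w.r.t. E0?
    return facts, query, (x, y)

def solve(facts, query):
    """Independent check: propagate coordinates outward from the query target."""
    source, target = query
    pos = {target: (0, 0)}
    pending = list(facts)
    while pending:
        remaining = []
        for head, name, tail in pending:
            dx, dy = DIRECTIONS[name]
            if tail in pos:
                pos[head] = (pos[tail][0] + dx, pos[tail][1] + dy)
            elif head in pos:
                pos[tail] = (pos[head][0] - dx, pos[head][1] - dy)
            else:
                remaining.append((head, name, tail))
        if len(remaining) == len(pending):
            raise ValueError("facts do not connect to the query")
        pending = remaining
    return pos[source]

facts, query, label = generate(depth=5, seed=1, shuffle=True)
assert solve(facts, query) == label  # solver agrees with the constructed label
```

In this sketch, `depth` stands in for productivity and `shuffle` for input-order variation; distractors and linguistic variation are left out for brevity.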

 
