Complex Logical Instruction Generation

  • 2025-08-12 17:54:27
  • Mian Zhang, Shujian Liu, Sixun Dong, Ming Yin, Yebowen Hu, Xun Wang, Steven Ma, Song Wang, Sathish Reddy Indurthi, Haoyun Deng, Zhiyu Zoey Chen, Kaiqiang Song

Abstract

Instruction following has catalyzed the recent era of Large Language Models (LLMs) and is the foundational skill underpinning more advanced capabilities such as reasoning and agentic behaviors. As tasks grow more challenging, the logic structures embedded in natural language instructions become increasingly intricate. However, how well LLMs perform on such logic-rich instructions remains under-explored. We propose LogicIFGen and LogicIFEval. LogicIFGen is a scalable, automated framework for generating verifiable instructions from code functions, which can naturally express rich logic such as conditionals, nesting, recursion, and function calls. We further curate a collection of complex code functions and use LogicIFGen to construct LogicIFEval, a benchmark comprising 426 verifiable logic-rich instructions. Our experiments demonstrate that current state-of-the-art LLMs still struggle to correctly follow the instructions in LogicIFEval. Most LLMs can follow fewer than 60% of the instructions, revealing significant deficiencies in their instruction-following ability. Code and Benchmark: https://github.com/mianzhang/LogicIF
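
To make the idea of a verifiable instruction derived from a code function concrete, the following is a minimal illustrative sketch, not the LogicIFGen pipeline itself. It assumes that verification means comparing a model's reported result with the output of the underlying function on the same input. The names `reference_function`, `INSTRUCTION`, and `verify` are hypothetical and do not come from the paper or its repository.

```python
# Hypothetical example: a small code function with a loop and nested
# conditionals, paired with a natural-language instruction describing it.

def reference_function(numbers):
    """Ground-truth logic that the instruction below describes."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            if n > 10:
                total += n * 2   # large even numbers count double
            else:
                total += n       # small even numbers count once
        else:
            total -= 1           # every odd number costs one point
    return total


# Hypothetical instruction text (an assumption, written for illustration).
INSTRUCTION = (
    "Start with a total of 0. Go through the numbers in order. "
    "If a number is even and greater than 10, add twice the number to the total. "
    "If it is even but not greater than 10, add the number itself. "
    "If it is odd, subtract 1 from the total. Report the final total."
)


def verify(model_answer, numbers):
    """Return True if the model's reported total matches the function's output."""
    return model_answer == reference_function(numbers)


if __name__ == "__main__":
    sample = [2, 12, 3]
    print(INSTRUCTION)
    print("Expected total:", reference_function(sample))
    print("Model answer accepted:", verify(25, sample))
```

Because the expected result comes from executing the function, a model's answer can be checked automatically without human grading, which is the property the abstract refers to as verifiable.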

 
