Abstract
To achieve generalized and robust natural-to-formal language conversion (N2F), large language models (LLMs) need strong decomposition and composition capabilities when faced with an unfamiliar formal language, and must be able to cope with compositional gaps and counter-intuitive symbolic names. To investigate whether LLMs have this set of basic capabilities in N2F, we propose the DEDC framework. This framework semi-automatically performs sample and task construction, allowing decoupled evaluation of the decomposition and composition capabilities of LLMs in N2F. Based on this framework, we evaluate and analyze the most advanced LLMs. The main findings are: (1) the LLMs are deficient in both decomposition and composition; (2) the LLMs exhibit a wide range of error types that can be attributed to deficiencies in natural language understanding and in the learning and use of symbolic systems; (3) compositional gaps and counter-intuitive symbolic names both affect the decomposition and composition of the LLMs. Our work provides a new perspective for investigating the basic decomposition and composition capabilities of LLMs in N2F. The detailed analysis of deficiencies and their attributions can help subsequent improvements of LLMs.