Abstract
Recovering the structure of causal graphical models from observational data is an essential yet challenging task for causal discovery in scientific scenarios. Domain-specific causal discovery usually relies on expert validation or prior analysis to improve the reliability of the recovered causal relations, but this is limited by the scarcity of expert resources. Recently, Large Language Models (LLMs) have been used for causal analysis across various domain-specific scenarios, suggesting their potential to serve as autonomous experts in guiding data-based structure learning. However, integrating LLMs into causal discovery is challenging because LLM-based reasoning can be inaccurate in revealing the actual causal structure. To address this challenge, we propose an error-tolerant LLM-driven causal discovery framework. The error-tolerant mechanism is threefold and explicitly accounts for potential inaccuracies. In the LLM-based reasoning process, an accuracy-oriented prompting strategy restricts causal analysis to a reliable range. Next, a knowledge-to-structure transition aligns LLM-derived causal statements with structural causal interactions. In the structure learning process, goodness-of-fit to data and adherence to LLM-derived priors are balanced to further address prior inaccuracies. Evaluation on eight real-world causal structures demonstrates the efficacy of our LLM-driven approach in improving data-based causal discovery, along with its robustness to inaccurate LLM-derived priors. Code is available at https://github.com/tyMadara/LLM-CD.
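
The balance between goodness-of-fit and adherence to LLM-derived priors can be illustrated with a minimal sketch. This is not the scoring function used in the paper or its repository. It assumes, purely for illustration, that each candidate graph already has a numeric data-fit score (higher is better), that priors take the form of required and forbidden directed edges, and that a hypothetical weight `lam` controls how strongly a violated prior is penalized. All names (`prior_violations`, `combined_score`, `lam`) and the variables and fit values are invented for this example.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (cause, effect); hypothetical representation


def prior_violations(edges: Set[Edge], required: Set[Edge], forbidden: Set[Edge]) -> int:
    """Count the prior constraints that a candidate graph contradicts."""
    missing_required = len(required - edges)    # required edges that are absent
    present_forbidden = len(forbidden & edges)  # forbidden edges that are present
    return missing_required + present_forbidden


def combined_score(fit: float, edges: Set[Edge], required: Set[Edge],
                   forbidden: Set[Edge], lam: float = 1.0) -> float:
    """Data-fit score minus a soft penalty for each violated prior.

    Because the penalty is finite, a graph that contradicts a (possibly
    wrong) prior can still win if the data favor it strongly enough.
    """
    return fit - lam * prior_violations(edges, required, forbidden)


# Illustrative priors and candidates (made-up variables and fit values).
required = {("smoking", "cancer")}
forbidden = {("cancer", "smoking")}
g_prior = {("smoking", "cancer")}  # agrees with both priors
g_data = {("cancer", "smoking")}   # violates both priors, but fits better

for lam in (1.0, 5.0):
    s_prior = combined_score(-105.0, g_prior, required, forbidden, lam)
    s_data = combined_score(-100.0, g_data, required, forbidden, lam)
    winner = "g_prior" if s_prior > s_data else "g_data"
    print(f"lam={lam}: g_prior={s_prior}, g_data={s_data}, preferred={winner}")
```

Under these assumptions, a small `lam` lets the data override a prior, while a large `lam` makes the prior dominate. Treating priors as soft penalties rather than hard constraints is one simple way to tolerate inaccurate LLM-derived statements.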