Parsing the Language of Expression: Enhancing Symbolic Regression with Domain-Aware Symbolic Priors

  • 2025-03-12 18:57:48
  • Sikai Huang, Yixin Berry Wen, Tara Adusumilli, Kusum Choudhary, Haizhao Yang

Abstract

Symbolic regression is essential for deriving interpretable expressions that elucidate complex phenomena by exposing the underlying mathematical and physical relationships in data. In this paper, we present an advanced symbolic regression method that integrates symbol priors from diverse scientific domains, including physics, biology, chemistry, and engineering, into the regression process. By systematically analyzing domain-specific expressions, we derive probability distributions of symbols to guide expression generation. We propose novel tree-structured recurrent neural networks (RNNs) that leverage these symbol priors, enabling domain knowledge to steer the learning process. Additionally, we introduce a hierarchical tree structure for representing expressions, where unary and binary operators are organized to facilitate more efficient learning. To further accelerate training, we compile characteristic expression blocks from each domain and include them in the operator dictionary, providing relevant building blocks. Experimental results demonstrate that leveraging symbol priors significantly enhances the performance of symbolic regression, resulting in faster convergence and higher accuracy.
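To make the idea of a symbol prior concrete, here is a minimal sketch of estimating one from a corpus of expressions. It is an illustration under stated assumptions, not the paper's method: the toy corpus, the operator vocabulary, the prefix-token representation, the function name symbol_prior, and the use of additive smoothing are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical operator vocabulary (an assumption, not the paper's dictionary).
VOCAB = ["add", "mul", "div", "pow", "sin", "exp"]

# Toy "domain" corpus (an assumption): each expression is a token list in prefix order.
CORPUS = [
    ["mul", "m", "a"],                        # m * a
    ["mul", "m", "pow", "c", "2"],            # m * c^2
    ["div", "mul", "m", "pow", "v", "2", "2"],  # (m * v^2) / 2
    ["mul", "A", "sin", "mul", "w", "t"],     # A * sin(w * t)
]


def symbol_prior(corpus, vocab, smoothing=1.0):
    """Estimate a unigram prior over operator symbols with additive smoothing."""
    # Count only tokens that are operators in the vocabulary; variables and
    # constants are ignored in this sketch.
    counts = Counter(tok for expr in corpus for tok in expr if tok in vocab)
    total = sum(counts[s] for s in vocab) + smoothing * len(vocab)
    # Smoothing keeps unseen operators at a small nonzero probability.
    return {s: (counts[s] + smoothing) / total for s in vocab}


prior = symbol_prior(CORPUS, VOCAB)
for symbol, p in sorted(prior.items(), key=lambda kv: -kv[1]):
    print(f"{symbol}: {p:.3f}")
```

A distribution of this kind could, for instance, be combined with a generator's predicted symbol probabilities so that operators common in a domain are sampled more often. How the priors are actually conditioned and injected into the tree-structured RNNs is not specified in the abstract.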

 
