Abstract
Complex dynamical systems, from macromolecules to ecosystems, are often modeled by stochastic differential equations. To learn such models from data, a common approach involves sparse selection among a large function library. However, we show that overfitting arises not just from individual model complexity, but also from the combinatorial growth of possible models. To address this, we introduce Parsimonious Stochastic Inference (PASTIS), a principled method combining likelihood-estimation statistics with extreme value theory to suppress superfluous parameters. PASTIS outperforms existing methods and reliably identifies minimal models, even with low sampling rates or measurement error. It extends to stochastic partial differential equations, and applies to ecological networks and reaction-diffusion dynamics.