Abstract
Machine learning has become a powerful tool for enhancing data assimilation. While supervised learning remains the standard method, reinforcement learning (RL) offers unique advantages through its sequential decision-making framework, which naturally fits the iterative nature of data assimilation by dynamically balancing model forecasts with observations. We develop RL-DAUNCE, a new RL-based method that enhances data assimilation with physical constraints through three key aspects. First, RL-DAUNCE inherits the computational efficiency of machine learning while it uniquely structures its agents to mirror ensemble members in conventional data assimilation methods. Second, RL-DAUNCE emphasizes uncertainty quantification by advancing multiple ensemble members, moving beyond simple mean-state optimization. Third, RL-DAUNCE's ensemble-as-agents design facilitates the enforcement of physical constraints during the assimilation process, which is crucial to improving the state estimation and subsequent forecasting. A primal-dual optimization strategy is developed to enforce constraints, which dynamically penalizes the reward function to ensure constraint satisfaction throughout the learning process. Also, state variable bounds are respected by constraining the RL action space. Together, these features ensure physical consistency without sacrificing efficiency. RL-DAUNCE is applied to the Madden-Julian Oscillation, an intermittent atmospheric phenomenon characterized by strongly non-Gaussian features and multiple physical constraints. RL-DAUNCE outperforms the standard ensemble Kalman filter (EnKF), which fails catastrophically due to the violation of physical constraints. Notably, RL-DAUNCE matches the performance of constrained EnKF, particularly in recovering intermittent signals, capturing extreme events, and quantifying uncertainties, while requiring substantially less computational effort.
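As a rough illustration of the two constraint mechanisms mentioned above, the following sketch shows a generic primal-dual reward penalty and a bounded action. It is not the RL-DAUNCE implementation: the function names, the scalar constraint form g(x) <= 0, the step size, and the additive-correction action are all assumptions made for this example.

```python
def penalized_reward(reward, violation, lam):
    """Lagrangian-style reward: subtract the multiplier-weighted violation.

    `violation` stands for g(x), where the (assumed) constraint is g(x) <= 0.
    """
    return reward - lam * violation


def dual_update(lam, violation, step_size=0.1):
    """Dual ascent step on the multiplier, projected onto lam >= 0."""
    return max(0.0, lam + step_size * violation)


def clip_action(state, action, lower, upper):
    """Limit an additive correction so that state + action stays in [lower, upper]."""
    return min(max(action, lower - state), upper - state)


# The multiplier grows while the constraint is violated (g > 0)
# and relaxes once it is satisfied (g < 0).
lam = 0.0
for violation in [0.5, 0.5, -0.2]:
    lam = dual_update(lam, violation)

# A correction that would push the state below its lower bound is truncated.
safe_action = clip_action(state=0.2, action=-0.5, lower=0.0, upper=1.0)
```

In this sketch the penalty weight is learned alongside the policy rather than fixed in advance, which is the general idea behind primal-dual schemes; the bound on the action handles simple box constraints on a state variable directly.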