### Abstract

The intrinsically high dimensionality of fluid dynamics is an inherent challenge to the control of aerodynamic flows, and this is further complicated by a flow's nonlinear response to strong disturbances. Deep reinforcement learning, which takes advantage of the exploratory aspects of reinforcement learning (RL) and the rich nonlinearity of a deep neural network, provides a promising approach to discovering feasible control strategies. However, the typical model-free approach to reinforcement learning requires a significant amount of interaction between the flow environment and the RL agent during training, and this high training cost impedes its development and application. In this work, we propose a model-based reinforcement learning (MBRL) approach that incorporates a novel reduced-order model as a surrogate for the full environment. The model consists of a physics-augmented autoencoder, which compresses high-dimensional CFD flow field snapshots into a three-dimensional latent space, and a latent dynamics model that is trained to accurately predict the long-time dynamics of trajectories in the latent space in response to action sequences. The robustness and generalizability of the model are demonstrated in two distinct flow environments: a pitching airfoil in a highly disturbed environment and a vertical-axis wind turbine in a disturbance-free environment. Based on the trained model in the first problem, we realize an MBRL strategy to mitigate lift variation during gust-airfoil encounters. We demonstrate that the policy learned in the reduced-order environment translates to an effective control strategy in the full CFD environment.