Abstract
In this paper, we tackle the problem of Unmanned Aerial Vehicle (UAV) path planning in complex and uncertain environments by designing a Model Predictive Control (MPC) scheme, based on a Long Short-Term Memory (LSTM) network, that is integrated into the Deep Deterministic Policy Gradient (DDPG) algorithm. In the proposed solution, LSTM-MPC operates as a deterministic policy within the DDPG network and leverages a predicting pool to store predicted future states and actions for improved robustness and efficiency. The use of the predicting pool also enables the initialization of the critic network, leading to improved convergence speed and a reduced failure rate compared to traditional reinforcement learning and deep reinforcement learning methods. The effectiveness of the proposed solution is evaluated by numerical simulations.