Improved Long Short-Term Memory-based Wastewater Treatment Simulators for Deep Reinforcement Learning

Abstract

Although Deep Reinforcement Learning (DRL) has shown outstanding results in robotics and games, it remains challenging to apply to the optimization of industrial processes such as wastewater treatment. One of the challenges is the lack of a simulation environment that represents the actual plant accurately enough to train DRL policies. The stochasticity and non-linearity of wastewater treatment data lead to unstable and incorrect model predictions over long time horizons. One possible reason for this incorrect simulation behavior is compounding error: because the model uses its own predictions as inputs at each time step, the error between the actual data and the prediction accumulates as the simulation continues. We implemented two methods to improve the trained models for wastewater treatment data, which resulted in more accurate simulators: (1) using the model's own predictions as inputs during training as a means of correction, and (2) changing the loss function to account for the long-term predicted shape (dynamics). The experimental results showed that these methods can improve simulator behavior, measured by Dynamic Time Warping over a one-year simulation, by up to 98% compared to the base model. These improvements show significant promise for creating simulators of biological processes that do not need pre-existing knowledge of the process but instead depend exclusively on time series data obtained from the system.
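The compounding error described above can be illustrated with a minimal Python sketch. It is not the paper's LSTM or its wastewater data: the toy linear system, the slightly wrong model coefficient, and the names `true_step`, `model_step`, `simulate`, and `dtw_distance` are all assumptions made for illustration. The sketch compares one-step predictions (the model is fed the true previous value) with a free-running simulation (the model is fed its own previous prediction), and scores both against the true trajectory with a textbook Dynamic Time Warping distance.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def true_step(x):
    # Hypothetical "plant": a stable first-order system (illustrative only).
    return 0.9 * x + 1.0


def model_step(x):
    # Hypothetical learned model with a small coefficient error.
    return 0.92 * x + 1.0


def simulate(step, x0, n_steps):
    """Free-running rollout: each output becomes the next input."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1]))
    return xs


truth = simulate(true_step, 0.0, 50)

# One-step prediction: the model always sees the true previous value.
one_step = [truth[0]] + [model_step(x) for x in truth[:-1]]

# Free run: the model sees its own previous prediction, so errors accumulate.
free_run = simulate(model_step, 0.0, 50)

max_one_step_err = max(abs(p - t) for p, t in zip(one_step, truth))
max_free_run_err = max(abs(p - t) for p, t in zip(free_run, truth))

print("max one-step error:", max_one_step_err)
print("max free-run error:", max_free_run_err)
print("DTW(one-step, truth):", dtw_distance(one_step, truth))
print("DTW(free-run, truth):", dtw_distance(free_run, truth))
```

In this toy setting the free-running trajectory drifts much further from the truth than the one-step predictions do, even though the same model produces both. This is the gap that training on the model's own predictions is meant to narrow.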
