Abstract
In the domain of continuous control, deep reinforcement learning (DRL) demonstrates promising results. However, the dependence of DRL on deep neural networks (DNNs) results in a demand for extensive data and increased computational complexity. To address this issue, a novel hybrid architecture for actor-critic reinforcement learning (RL) algorithms is introduced. The proposed architecture integrates the broad learning system (BLS) with a DNN, aiming to merge the strengths of both architectural paradigms. Specifically, the critic network is implemented using BLS, while the actor network is constructed with a DNN. The critic network parameters are estimated by ridge regression, and the actor network parameters are optimized through gradient descent. The effectiveness of the proposed algorithm is evaluated on two classic continuous control tasks, and its performance is compared with the widely recognized deep deterministic policy gradient (DDPG) algorithm. Numerical results show that the proposed algorithm is more computationally efficient than DDPG and learns faster. Applying the proposed architecture to other actor-critic RL algorithms is suggested for future study.
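As an illustration of the critic side of this design, the following is a minimal sketch of a BLS-style regressor whose output weights are obtained by closed-form ridge regression. It is not the authors' implementation: the class name `BLSCritic`, the node counts, the `tanh` enhancement activation, the regularization strength `lam`, and the synthetic targets are all assumptions made for the example, since the abstract does not specify them. In an actor-critic setting, the rows of `X` would presumably be concatenated state-action pairs and `y` the critic's regression targets.

```python
import numpy as np


class BLSCritic:
    """Hypothetical BLS-style critic: fixed random hidden nodes, ridge-regressed output weights."""

    def __init__(self, in_dim, n_feature=20, n_enhance=40, lam=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        # Random, untrained weights for the feature nodes (assumed linear here).
        self.Wf = rng.standard_normal((in_dim, n_feature))
        self.bf = rng.standard_normal(n_feature)
        # Random, untrained weights for the enhancement nodes.
        self.We = rng.standard_normal((n_feature, n_enhance))
        self.be = rng.standard_normal(n_enhance)
        self.lam = lam  # ridge regularization strength (assumed value)
        self.beta = np.zeros(n_feature + n_enhance)

    def _hidden(self, X):
        Z = X @ self.Wf + self.bf           # feature nodes
        H = np.tanh(Z @ self.We + self.be)  # enhancement nodes
        return np.hstack([Z, H])            # broad layer [Z | H]

    def fit(self, X, y):
        A = self._hidden(X)
        # Closed-form ridge solution: beta = (A^T A + lam * I)^{-1} A^T y
        gram = A.T @ A + self.lam * np.eye(A.shape[1])
        self.beta = np.linalg.solve(gram, A.T @ y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic stand-in for (state, action) inputs and critic targets.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * X[:, 2]

critic = BLSCritic(in_dim=3).fit(X, y)
q = critic.predict(X)
```

Because only the output weights `beta` are fitted, and in closed form, a critic update in this sketch is a single linear solve rather than many gradient steps, which is one plausible source of the computational savings the abstract reports. The actor would still be a DNN trained by gradient descent.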