Online Reinforcement Learning-Based Dynamic Adaptive Evaluation Function for Real-Time Strategy Tasks

Abstract

Effective evaluation of real-time strategy tasks requires adaptive mechanisms to cope with dynamic and unpredictable environments. This study proposes a method that makes evaluation functions more responsive to changes in the battlefield situation in real time, using an online reinforcement learning-based dynamic weight adjustment mechanism within a real-time strategy game. Building on traditional static evaluation functions, the method employs gradient descent in online reinforcement learning to update weights dynamically, incorporating weight decay to ensure stability. Additionally, the AdamW optimizer is integrated to adjust the learning rate and decay rate of online reinforcement learning in real time, further reducing the dependency on manual parameter tuning. Round-robin competition experiments demonstrate that this method significantly enhances the effectiveness of the Lanchester combat model evaluation function, the Simple evaluation function, and the Simple Sqrt evaluation function when applied in planning algorithms including IDABCD, IDRTMinimax, and Portfolio AI. The method achieves a notable improvement in scores, with the enhancement becoming more pronounced as the map size increases. Furthermore, the increase in evaluation function computation time induced by this method is kept below 6% for all evaluation functions and planning algorithms. The proposed dynamic adaptive evaluation function offers a promising approach for real-time strategy task evaluation.
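
The weight update outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear evaluation function over a hand-picked feature vector, a squared-error loss against a scalar target (for example, an observed outcome or a bootstrapped estimate), and the standard AdamW rule with decoupled weight decay. The class name `AdamWLinearEvaluator`, the features, the target, and all hyperparameter values are hypothetical.

```python
import math


class AdamWLinearEvaluator:
    """Linear evaluation function whose weights are updated online with AdamW.

    Hypothetical sketch: the features, target signal, and hyperparameters
    are assumptions, not values taken from the paper.
    """

    def __init__(self, n_features, lr=0.01, beta1=0.9, beta2=0.999,
                 eps=1e-8, weight_decay=0.01):
        self.w = [1.0] * n_features   # start from static, uniform weights
        self.m = [0.0] * n_features   # first-moment (mean) estimate
        self.v = [0.0] * n_features   # second-moment estimate
        self.t = 0                    # number of updates performed
        self.lr = lr
        self.beta1 = beta1
        self.beta2 = beta2
        self.eps = eps
        self.weight_decay = weight_decay

    def evaluate(self, features):
        """Return the weighted sum of the state features."""
        return sum(w * f for w, f in zip(self.w, features))

    def update(self, features, target):
        """Take one AdamW step on the loss 0.5 * (evaluate(features) - target)**2.

        Returns the prediction error measured before the step.
        """
        self.t += 1
        error = self.evaluate(features) - target
        for i, f in enumerate(features):
            g = error * f  # gradient of the loss with respect to w[i]
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)  # bias correction
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            # Decoupled weight decay: applied to the weight directly,
            # not folded into the gradient.
            self.w[i] -= self.lr * (m_hat / (math.sqrt(v_hat) + self.eps)
                                    + self.weight_decay * self.w[i])
        return error


evaluator = AdamWLinearEvaluator(n_features=2)
features, target = [2.0, 1.0], 5.0   # hypothetical state features and target
first_error = evaluator.update(features, target)
for _ in range(500):
    last_error = evaluator.update(features, target)
```

In this sketch the per-weight step size adapts through the moment estimates, which is one way to read the abstract's claim of reduced manual tuning; how the paper defines the features and the learning signal is not stated in the abstract.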
