Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning

Abstract

While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which the updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism is used which increases the reward applied to actions that lead to damaging the opponent in successive hits in what we term "hit clusters".
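
The hit-cluster idea can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: the function name `cluster_weighted_rewards`, the `bonus` parameter, and the linear weight `1 + bonus * (streak - 1)` are hypothetical and are not taken from the paper. The sketch assumes the per-shot damage values of one encounter have been buffered and are processed once the encounter has ended, which is when the abstract says the updates are carried out.

```python
from typing import List


def cluster_weighted_rewards(damages: List[float], bonus: float = 0.5) -> List[float]:
    """Weight each shot's damage reward by its position in a run of consecutive hits.

    damages[i] is the damage caused by the i-th shot of one encounter (0 means a miss).
    The linear weighting used here is an illustrative assumption, not the paper's formula.
    """
    rewards = []
    streak = 0  # number of consecutive hits ending at the current shot
    for damage in damages:
        if damage > 0:
            streak += 1
            # Later hits in the same cluster receive a larger multiplier.
            rewards.append(damage * (1.0 + bonus * (streak - 1)))
        else:
            # A miss ends the current hit cluster.
            streak = 0
            rewards.append(0.0)
    return rewards


if __name__ == "__main__":
    # One buffered encounter: two hits, a miss, then three hits.
    print(cluster_weighted_rewards([10, 10, 0, 10, 10, 10], bonus=0.5))
```

Under these assumptions, the third hit of an unbroken run earns twice the reward of an isolated hit, so actions that keep the aim on target are reinforced more strongly than actions that produce a single lucky hit.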
