Abstract
Reinforcement learning combined with deep neural networks has recently performed remarkably well in many game genres. It has surpassed human-level performance in fixed game environments and in turn-based two-player board games. However, to the best of our knowledge, no research has shown a result that surpasses human-level performance in modern complex fighting games. This is due to the inherent difficulties of such games, including vast action spaces, real-time constraints, and the need for performance to generalize across various opponents. We overcame these challenges and built 1v1 battle AI agents for the commercial game "Blade & Soul". The trained agents competed against five professional gamers and achieved a 62% win rate. This paper presents a practical reinforcement learning method that includes a novel self-play curriculum and data skipping techniques. Through the curriculum, three agents with different styles are created by reward shaping and are trained against each other for robust performance. Additionally, this paper proposes data skipping techniques that increase data efficiency and facilitate exploration in vast spaces.