RAMario: Experimental Approach to Reptile Algorithm -- Reinforcement Learning for Mario

Abstract

This research paper presents an experimental approach to using the Reptile algorithm for reinforcement learning to train a neural network to play Super Mario Bros. We implement the Reptile algorithm using the Super Mario Bros Gym library and TensorFlow in Python, creating a neural network model with a single convolutional layer, a flatten layer, and a dense layer. We define the optimizer and use the Reptile class to create an instance of the Reptile meta-learning algorithm. We train the model over multiple tasks and episodes: in each step we choose an action using the current weights of the neural network model, take that action in the environment, and update the model weights using the Reptile algorithm. We evaluate the performance of the algorithm by printing the total reward for each episode. In addition, we compare the Reptile approach to two other popular reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), applied to the same Super Mario Bros task. Our results demonstrate that the Reptile algorithm provides a promising approach to few-shot learning in video game AI, with comparable or even better performance than the other two algorithms, particularly in terms of the number of moves versus the distance the agent covers over 1M episodes of training. The results show that the best total distances for world 1-2 in the game environment were ~1732 (PPO), ~1840 (DQN), and ~2300 (RAMario). Full code is available at https://github.com/s4nyam/RAMario.
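To make the weight update described in the abstract concrete, here is a minimal sketch of the generic Reptile rule: adapt a copy of the weights on one task with a few gradient steps, then move the meta-weights part of the way toward the adapted copy. It is not the authors' implementation. As an assumption for illustration, it replaces the Mario environment and the TensorFlow model with a toy quadratic loss on a two-element weight vector, and the names `inner_sgd`, `reptile_step`, and `train`, together with all hyperparameter values, are hypothetical.

```python
import random


def inner_sgd(weights, target, steps=5, lr=0.1):
    """Adapt a copy of the weights to one task with a few SGD steps.

    Assumption: the task loss is the toy quadratic sum((w - t)^2),
    standing in for the reinforcement learning loss used in the paper.
    """
    adapted = list(weights)
    for _ in range(steps):
        # Gradient of (w - t)^2 with respect to w is 2 * (w - t).
        adapted = [w - lr * 2.0 * (w - t) for w, t in zip(adapted, target)]
    return adapted


def reptile_step(weights, adapted, meta_lr=0.5):
    """Reptile outer update: theta <- theta + meta_lr * (phi - theta)."""
    return [w + meta_lr * (a - w) for w, a in zip(weights, adapted)]


def train(num_iterations=200, seed=0):
    """Alternate task sampling, inner adaptation, and the Reptile update."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(num_iterations):
        # Hypothetical task distribution: targets scattered around (3, -2).
        target = [3.0 + rng.uniform(-1, 1), -2.0 + rng.uniform(-1, 1)]
        adapted = inner_sgd(weights, target)
        weights = reptile_step(weights, adapted)
    return weights


if __name__ == "__main__":
    print(train())
```

In this toy setting the meta-weights drift toward the centre of the task distribution, which is the kind of initialization Reptile aims for: one from which a few inner steps suffice to fit any single task. In the setup the abstract describes, the inner loop would instead be gradient updates from episodes played in the game environment.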
