Autonomous Penetration Testing using Reinforcement Learning

Abstract

Penetration testing (pentesting) involves performing a controlled attack on a computer system in order to assess its security. Although an effective method for testing security, pentesting requires highly skilled practitioners, and there is currently a growing shortage of skilled cyber security professionals. One avenue for alleviating this problem is to automate the pentesting process using artificial intelligence techniques. Current approaches to automated pentesting have relied on model-based planning; however, the cyber security landscape is changing rapidly, which makes maintaining up-to-date models of exploits a challenge. This project investigated the application of model-free Reinforcement Learning (RL) to automated pentesting. Model-free RL has the key advantage over model-based planning of not requiring a model of the environment, instead learning the best policy through interaction with the environment. We first designed and built a fast, low-compute simulator for training and testing autonomous pentesting agents. We did this by framing pentesting as a Markov Decision Process with the known configuration of the network as states, the available scans and exploits as actions, and the reward determined by the value of machines on the network. We then used this simulator to investigate the application of model-free RL to pentesting. We tested the standard Q-learning algorithm using both tabular and neural-network-based implementations. We found that within the simulated environment both tabular and neural network implementations were able to find optimal attack paths for a range of different network topologies and sizes without having a model of action behaviour. However, the implemented algorithms were only practical for smaller networks and numbers of actions. Further work is needed in developing scalable RL algorithms and testing these algorithms in larger and higher-fidelity environments.
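
To make the Markov Decision Process framing concrete, the following is a minimal sketch of tabular Q-learning on a toy network. It is not the simulator or the implementation described in the paper. The three-machine chain topology, the machine values, the action cost, the termination rule, the hyperparameters, and the names `step`, `train`, and `greedy_path` are all assumptions made for illustration. Scans are omitted and every exploit is deterministic.

```python
import random
from collections import defaultdict

# Hypothetical toy network (not from the paper): three machines in a chain,
# where machine i can only be attacked once machine i-1 is compromised.
MACHINE_VALUES = [0.0, 0.0, 10.0]  # assumed reward for compromising each machine
ACTION_COST = 1.0                  # assumed cost charged for every action
N = len(MACHINE_VALUES)


def step(state, action):
    """Apply 'exploit machine `action`'; return (next_state, reward, done).

    `state` is a tuple of booleans marking which machines are compromised.
    """
    reachable = action == 0 or state[action - 1]
    reward = -ACTION_COST
    if reachable and not state[action]:
        state = state[:action] + (True,) + state[action + 1:]
        reward += MACHINE_VALUES[action]
    # Assumed termination rule: the episode ends once every machine is compromised.
    return state, reward, all(state)


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=20, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to an estimated return
    for _ in range(episodes):
        state = (False,) * N
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(N)
            else:
                action = max(range(N), key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # Standard Q-learning target: reward plus discounted best next value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in range(N))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=10):
    """Follow the greedy policy from the initial state and list the actions taken."""
    state, path = (False,) * N, []
    for _ in range(max_steps):
        action = max(range(N), key=lambda a: q[(state, a)])
        path.append(action)
        state, _, done = step(state, action)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of machines the learned policy attacks. The agent is never given the reachability rule encoded in `step`; it has to discover a workable attack order through interaction alone, which is the property the abstract attributes to model-free RL.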
