Financial Trading as a Game: A Deep Reinforcement Learning Approach

Abstract

An automatic program that generates constant profit from the financial market is lucrative for every market practitioner. Recent advances in deep reinforcement learning provide a framework toward end-to-end training of such a trading agent. In this paper, we propose a Markov Decision Process (MDP) model suitable for the financial trading task and solve it with the state-of-the-art deep recurrent Q-network (DRQN) algorithm. We propose several modifications to the existing learning algorithm to make it more suitable under the financial trading setting, namely: 1. We employ a substantially smaller replay memory (only a few hundred in size) compared to the ones used in modern deep reinforcement learning algorithms (often millions in size). 2. We develop an action augmentation technique to mitigate the need for random exploration by providing extra feedback signals for all actions to the agent. This enables us to use a greedy policy over the course of learning and shows strong empirical performance compared to the more commonly used epsilon-greedy exploration. However, this technique is specific to financial trading under a few market assumptions. 3. We sample a longer sequence for recurrent neural network training. A side product of this mechanism is that we can now train the agent only once every T steps. This greatly reduces training time since the overall computation is down by a factor of T. We combine all of the above into a complete online learning algorithm and validate our approach on the spot foreign exchange market.
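To make the action augmentation idea concrete, the following is a minimal sketch, not the paper's exact formulation. It assumes three discrete positions (short, neutral, long), that the agent's trades do not move the market price, and a hypothetical reward equal to position times price change minus a cost proportional to the change in position. The names `ACTIONS`, `augmented_rewards`, and `cost_per_unit` are illustrative assumptions. Under those assumptions, the reward for every action can be computed from one observed price change, so the agent receives a feedback signal for all actions rather than only the one it took.

```python
import numpy as np

# Hypothetical action encoding: -1 = short, 0 = neutral, 1 = long.
ACTIONS = np.array([-1, 0, 1])


def augmented_rewards(prev_position, price_change, cost_per_unit):
    """Return one reward per action for a single time step.

    Assumed (illustrative) reward: profit from holding the chosen position
    over the observed price change, minus a transaction cost proportional
    to how far the position moved from the previous one.
    """
    pnl = ACTIONS * price_change
    cost = cost_per_unit * np.abs(ACTIONS - prev_position)
    return pnl - cost


# Example: previously neutral, price rose by 2.0, cost 0.5 per unit traded.
rewards = augmented_rewards(prev_position=0, price_change=2.0, cost_per_unit=0.5)
greedy_action = ACTIONS[np.argmax(rewards)]
```

Because a reward is available for every action, a Q-learning update could in principle be applied to all action values at each step, which is what would let a greedy policy replace random exploration in this setting.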
