Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution

Abstract

Optimal execution is a sequential decision-making problem for cost-saving in algorithmic trading. Studies have found that reinforcement learning (RL) can help decide the order-splitting sizes. However, a problem remains unsolved: how to place limit orders at appropriate limit prices? The key challenge lies in the "continuous-discrete duality" of the action space. On the one hand, the continuous action space using percentage changes in prices is preferred for generalization. On the other hand, the trader eventually needs to choose limit prices discretely due to the existence of the tick size, which requires specialization for every single stock with different characteristics (e.g., the liquidity and the price range). So we need continuous control for generalization and discrete control for specialization. To this end, we propose a hybrid RL method to combine the advantages of both of them. We first use a continuous control agent to scope an action subset, then deploy a fine-grained agent to choose a specific limit price. Extensive experiments show that our method has higher sample efficiency and better training stability than existing RL algorithms and significantly outperforms previous learning-based methods for order execution.
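
The two-stage action selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `half_width` window parameter, and the toy scoring function standing in for the fine-grained agent are all hypothetical, and in the actual method both stages would be learned policies. The sketch only shows how a continuous action (a percentage price offset) could scope a small set of tick-aligned prices from which a discrete choice is then made.

```python
from typing import Callable, List, Sequence


def scope_candidate_prices(ref_price: float, pct_offset: float,
                           tick_size: float, half_width: int) -> List[float]:
    """Stage 1 (continuous): turn a relative price offset into a window
    of tick-aligned candidate limit prices around the implied target.

    `half_width` (hypothetical) is the number of ticks kept on each side.
    """
    target = ref_price * (1.0 + pct_offset)
    center_tick = round(target / tick_size)  # index of the nearest tick
    ticks = range(center_tick - half_width, center_tick + half_width + 1)
    # Drop non-positive prices; round to suppress floating-point noise.
    return [round(t * tick_size, 10) for t in ticks if t > 0]


def choose_limit_price(candidates: Sequence[float],
                       score_fn: Callable[[Sequence[float]], Sequence[float]]
                       ) -> float:
    """Stage 2 (discrete): pick the candidate with the highest score.

    `score_fn` stands in for a learned fine-grained agent (assumption).
    """
    scores = score_fn(candidates)
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


def toy_scores(prices: Sequence[float]) -> List[float]:
    """Toy scorer (assumption): prefers prices closest to 99.85."""
    return [-abs(p - 99.85) for p in prices]


if __name__ == "__main__":
    # Continuous agent proposes a 0.1% offset below a reference price of 100.
    window = scope_candidate_prices(ref_price=100.0, pct_offset=-0.001,
                                    tick_size=0.05, half_width=2)
    print(window)
    print(choose_limit_price(window, toy_scores))
```

Expressing the first stage as a percentage offset keeps it independent of any one stock's price range, while the tick size and window enter only in the second stage, which mirrors the generalization/specialization split the abstract argues for.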
