Structured Reinforcement Learning for Combinatorial Decision-Making

Abstract

Reinforcement learning (RL) is increasingly applied to real-world problems involving complex and structured decisions, such as routing, scheduling, and assortment planning. These settings challenge standard RL algorithms, which struggle to scale, generalize, and exploit structure in the presence of combinatorial action spaces. We propose Structured Reinforcement Learning (SRL), a novel actor-critic framework that embeds combinatorial optimization layers into the actor neural network. We enable end-to-end learning of the actor via Fenchel-Young losses and provide a geometric interpretation of SRL as a primal-dual algorithm in the dual of the moment polytope. Across six environments with exogenous and endogenous uncertainty, SRL matches or surpasses the performance of unstructured RL and imitation learning on static tasks and improves over these baselines by up to 92% on dynamic problems, with improved stability and convergence speed.
