Abstract
Advances in reinforcement learning (RL) have led to its successful application in complex tasks with continuous state and action spaces. Despite these advances in practice, most theoretical work pertains to finite state and action spaces. We propose building a theoretical understanding of continuous state and action spaces by employing a geometric lens to understand the locally attained set of states. The set of all parametrised policies learnt through a semi-gradient based approach induces a set of attainable states in RL. We show that the training dynamics of a two-layer neural policy, trained using an actor-critic algorithm, induce a low-dimensional manifold of attainable states embedded in the high-dimensional nominal state space. We prove that, under certain conditions, the dimensionality of this manifold is of the order of the dimensionality of the action space. This is the first result of its kind, linking the geometry of the state space to the dimensionality of the action space. We empirically corroborate this upper bound for four MuJoCo environments and also demonstrate the results in a toy environment with varying dimensionality. We also show the applicability of this theoretical result by introducing a local manifold learning layer to the policy and value function networks, which improves performance in control environments with very high degrees of freedom by changing one layer of the neural network to learn sparse representations.