Abstract
Deep reinforcement learning (DRL) provides a promising way to learn navigation in complex autonomous driving scenarios. However, identifying the subtle cues that can indicate drastically different outcomes remains an open problem in designing autonomous systems that operate in human environments. In this work, we show that explicitly inferring the latent state and encoding spatial-temporal relationships in a reinforcement learning framework can help address this difficulty. We encode prior knowledge on the latent states of other drivers through a framework that combines the reinforcement learner with a supervised learner. In addition, we model the influence passing between different vehicles through graph neural networks (GNNs). The proposed framework significantly improves performance in the context of navigating T-intersections compared with state-of-the-art baseline approaches.