Applying reinforcement learning to autonomous driving entails certain challenges, primarily due to massive traffic flows that change dynamically. To address such challenges, it is necessary to quickly determine response strategies to the changing intentions of surrounding vehicles. Accordingly, we propose a new policy optimization method for safe driving using graph-based interaction-aware constraints. In this framework, the motion prediction and control modules are trained simultaneously while sharing a latent representation that contains a social context. Further, to reflect social interactions, we express the movements of agents in graph form and filter the features, which helps preserve the spatiotemporal locality of adjacent nodes. Furthermore, we create feedback loops to combine these two modules effectively. As a result, this approach encourages the learned controller to be safe from dynamic risks and also renders the motion prediction robust under various situations. In the experiments, we set up a navigation scenario comprising various situations in CARLA, an urban driving simulator. The experiments show state-of-the-art performance in both navigation strategy and motion prediction compared to the baselines.