In recent years, interest in leveraging quantum effects to enhance machine learning tasks has increased significantly. Many algorithms that speed up supervised and unsupervised learning have been established. The first framework in which ways to exploit quantum resources were found specifically for the broader context of reinforcement learning is projective simulation. Projective simulation presents an agent-based reinforcement learning approach designed in a manner that may support quantum walk-based speed-ups. Although classical variants of projective simulation have been benchmarked against common reinforcement learning algorithms, very few formal theoretical analyses have been provided for its performance in standard learning scenarios. In this paper, we provide a detailed formal discussion of the properties of this model. Specifically, we prove that one version of the projective simulation model, understood as a reinforcement learning approach, converges to optimal behavior in a large class of Markov decision processes. This proof shows that a physically inspired approach to reinforcement learning can be guaranteed to converge.