Neuromorphic devices represent an attempt to mimic aspects of the brain's architecture and dynamics with the aim of replicating its hallmark functional capabilities in terms of computational power, robust learning and energy efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic system to implement a proof-of-concept demonstration of reward-modulated spike-timing-dependent plasticity in a spiking network that learns to play the Pong video game by smooth pursuit. This system combines an electronic mixed-signal substrate for emulating neuron and synapse dynamics with an embedded digital processor for on-chip learning, which in this work also serves to simulate the virtual environment and learning agent. The analog emulation of neuronal membrane dynamics enables a 1000-fold acceleration with respect to biological real-time, with the entire chip operating on a power budget of 57 mW. Compared to an equivalent simulation using state-of-the-art software, the on-chip emulation is at least one order of magnitude faster and three orders of magnitude more energy-efficient. We demonstrate how on-chip learning can mitigate the effects of fixed-pattern noise, which is unavoidable in analog substrates, while making use of temporal variability for action exploration. Learning compensates for imperfections of the physical substrate, as manifested in neuronal parameter variability, by adapting synaptic weights to match the respective excitability of individual neurons.
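
To make the learning rule named above concrete, the following is a minimal sketch of a generic reward-modulated STDP update in Python. It is not the on-chip implementation: the function names, the exponential causal kernel, the running-average reward baseline and all parameter values (`amplitude`, `tau_ms`, `learning_rate`, `decay`) are assumptions chosen for illustration only.

```python
import math


def stdp_eligibility(dt_ms, amplitude=1.0, tau_ms=20.0):
    """Eligibility contributed by one spike pair (hypothetical kernel).

    dt_ms is t_post - t_pre. Only causal pairs (pre before post, dt_ms > 0)
    contribute, decaying exponentially with the spike-time difference.
    """
    if dt_ms <= 0:
        return 0.0
    return amplitude * math.exp(-dt_ms / tau_ms)


def rstdp_step(weight, eligibility, reward, mean_reward,
               learning_rate=0.1, decay=0.5):
    """One reward-modulated update (illustrative, assumed form).

    The weight change is the eligibility scaled by how much the reward
    exceeds its running average, so better-than-expected outcomes
    strengthen recently active synapses and worse ones weaken them.
    """
    new_weight = weight + learning_rate * (reward - mean_reward) * eligibility
    # Move the reward baseline a fraction `decay` towards the latest reward.
    new_mean = mean_reward + decay * (reward - mean_reward)
    return new_weight, new_mean


# Example: a causal spike pair 5 ms apart followed by a reward above baseline.
trace = stdp_eligibility(5.0)
weight, baseline = rstdp_step(weight=1.0, eligibility=trace,
                              reward=1.0, mean_reward=0.0)
```

In a rule of this form, a neuron that is less excitable because of fixed-pattern noise would simply end up with larger weights on its rewarded inputs, which is the kind of compensation the abstract describes.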