In dynamic environments, learned controllers are supposed to take motion into account when selecting the action to be taken. However, in existing reinforcement learning works, motion is rarely treated explicitly; rather, it is assumed that the controller implicitly learns the necessary motion representation from temporal stacks of frames. In this paper, we show that, for continuous control tasks, learning an explicit representation of motion improves the quality of the learned controller in dynamic scenarios. We demonstrate this on common benchmark tasks (Walker, Swimmer, Hopper), on target-reaching and ball-catching tasks with simulated robotic arms, and on a dynamic single-ball juggling task. Moreover, we find that, when equipped with an appropriate network architecture, the agent can, on some tasks, also learn motion features with pure reinforcement learning, without additional supervision. Further, we find that using an image difference between the current and the previous frame as an additional input leads to better results than a temporal stack of frames.
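To make the last comparison concrete, the following is a minimal sketch of the two input encodings, not the implementation used in this work. It assumes observations are `uint8` images of shape (height, width, channels), scaled to [0, 1] and joined along the channel axis; the function names `frame_stack` and `frame_with_difference` are hypothetical.

```python
import numpy as np


def _to_float(frame):
    # Cast before subtracting: uint8 arithmetic would wrap around on negative values.
    return frame.astype(np.float32) / 255.0


def frame_stack(prev_frame, curr_frame):
    """Temporal stack: previous and current frame concatenated along channels."""
    return np.concatenate([_to_float(prev_frame), _to_float(curr_frame)], axis=-1)


def frame_with_difference(prev_frame, curr_frame):
    """Current frame plus the signed difference to the previous frame."""
    prev = _to_float(prev_frame)
    curr = _to_float(curr_frame)
    diff = curr - prev  # nonzero only at pixels that changed between frames
    return np.concatenate([curr, diff], axis=-1)


# Toy example (assumed data): a single bright pixel moves one column to the right.
prev = np.zeros((4, 4, 1), dtype=np.uint8)
curr = np.zeros((4, 4, 1), dtype=np.uint8)
prev[1, 1, 0] = 255
curr[1, 2, 0] = 255

stacked = frame_stack(prev, curr)
obs = frame_with_difference(prev, curr)
```

In this toy case both encodings have the same shape, but the difference channel is zero everywhere except at the two pixels the object left and entered, so the motion cue is exposed directly rather than left for the network to extract from the stacked frames.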