This work examines the role of reinforcement learning in reducing the severity of on-road collisions by controlling velocity and steering in situations in which contact is imminent. We construct a model that takes camera images as input and learns to predict the dynamics of obstacles, cars, and pedestrians, and we train our policies using this model. Two policies that control both braking and steering are compared against a baseline in which the only action taken is conventional braking in a straight line. The two policies are trained using two distinct reward structures: one in which every collision incurs the same fixed penalty, and one in which the penalty is calculated from established delta-v models of injury severity. The results show that both policies outperform the baseline, with the policy trained using the injury models performing best.
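
To make the difference between the two reward structures concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a one-dimensional, perfectly inelastic collision to estimate delta-v from momentum conservation, and a logistic injury-risk curve. The function names, the constant `FIXED_PENALTY`, and the coefficients `b0` and `b1` are hypothetical placeholders chosen for illustration, not values from any published injury model.

```python
import math

# Hypothetical penalty magnitude for the fixed-penalty reward (assumption).
FIXED_PENALTY = 1.0


def delta_v(mass_ego, mass_other, closing_speed):
    """Velocity change of the ego vehicle in a 1-D perfectly inelastic
    collision, derived from conservation of momentum (simplifying assumption)."""
    return mass_other / (mass_ego + mass_other) * closing_speed


def fixed_penalty_reward(collided):
    """Reward structure 1: every collision costs the same, regardless of severity."""
    return -FIXED_PENALTY if collided else 0.0


def injury_penalty_reward(collided, mass_ego, mass_other, closing_speed,
                          b0=-5.0, b1=0.3):
    """Reward structure 2: the penalty grows with estimated injury risk.

    The logistic form and the coefficients b0, b1 are illustrative
    placeholders, not fitted values from an injury-severity study.
    """
    if not collided:
        return 0.0
    dv = delta_v(mass_ego, mass_other, closing_speed)
    risk = 1.0 / (1.0 + math.exp(-(b0 + b1 * dv)))  # value in (0, 1)
    return -risk


if __name__ == "__main__":
    # Two equal-mass vehicles (kg) at different closing speeds (m/s).
    for speed in (10.0, 30.0):
        print(speed,
              fixed_penalty_reward(True),
              injury_penalty_reward(True, 1500.0, 1500.0, speed))
```

Under the fixed penalty, a low-speed and a high-speed impact are equally bad, so the only thing a policy can gain is avoiding contact altogether. Under the severity-based penalty, a policy is also rewarded for shedding speed or redirecting the impact when contact cannot be avoided.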