Reinforcement learning algorithms are gaining popularity in fields where optimal scheduling is important, and oncology is no exception. The complex and uncertain dynamics of cancer limit the performance of traditional model-based scheduling strategies such as Optimal Control. Motivated by the recent success of model-free Deep Reinforcement Learning (DRL) in challenging control tasks and in medical treatments, we use Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) to design a personalized cancer chemotherapy schedule. We show that both algorithms succeed in the task and outperform the Optimal Control solution in the presence of uncertainty. Furthermore, we show that DDPG can eradicate cancer more efficiently than DQN due to its continuous action space. Finally, we provide some intuition regarding the number of samples required for training.