Reinforcement learning algorithms are gaining popularity in fields where optimal scheduling is important, and oncology is no exception. The complex and uncertain dynamics of cancer limit the performance of traditional model-based scheduling strategies such as Optimal Control. Some preliminary efforts have already been made to design chemotherapy schedules using Q-learning with a discrete action space. Motivated by the recent success of model-free Deep Reinforcement Learning (DRL) in challenging control tasks, we suggest the use of the Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) algorithms to design a personalized cancer chemotherapy schedule. We show that both algorithms succeed in the task and outperform the Optimal Control solution in the presence of uncertainty. Furthermore, we show that DDPG can eliminate cancer more efficiently than DQN owing to its continuous action space. Finally, we provide some intuition regarding the number of samples required for training.