Abstract
The design of reward functions in reinforcement learning is a human skill that comes with experience. Unfortunately, the literature offers no methodology to guide a human in designing a reward function, or to allow a human to transfer reward-design skills to another human in a systematic manner. In this paper, we use Systematic Instructional Design, an approach from human education, to engineer a machine education methodology for designing reward functions for reinforcement learning. We demonstrate the methodology by designing a hierarchical genetic reinforcement learner that adopts a neural network representation to evolve a swarm controller for an agent shepherding a boids-based swarm. The results reveal that the methodology is able to guide the design of hierarchical reinforcement learners, with each model in the hierarchy learning incrementally through a multi-part reward function. The hierarchy acts as a decision fusion function that combines the individual behaviours and skills learnt from each instruction to create a smart shepherd to control the swarm.