Constrained Reinforcement Learning for Dynamic Material Handling

Abstract

As one of the core parts of flexible manufacturing systems, material handling involves storage and transportation of materials between workstations with automated vehicles. Improvements in material handling can boost the overall efficiency of the manufacturing system. However, the occurrence of dynamic events during the optimisation of task arrangements poses a challenge that requires adaptability and effectiveness. In this paper, we address the scheduling of automated guided vehicles for dynamic material handling. Motivated by real-world scenarios, unknown new tasks and unexpected vehicle breakdowns are regarded as dynamic events in our problem. We formulate the problem as a constrained Markov decision process which takes into account tardiness and available vehicles as cumulative and instantaneous constraints, respectively. An adaptive constrained reinforcement learning algorithm that combines Lagrangian relaxation and invalid action masking, named RCPOM, is proposed to address the problem with two hybrid constraints. Moreover, a gym-like dynamic material handling simulator, named DMH-GYM, is developed and equipped with diverse problem instances, which can be used as benchmarks for dynamic material handling. Experimental results on the problem instances demonstrate the outstanding performance of our proposed approach compared with eight state-of-the-art constrained and non-constrained reinforcement learning algorithms, and widely used dispatching rules for material handling.
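The two constraint-handling ideas named in the abstract can be illustrated with a minimal, generic sketch. This is not the RCPOM implementation: the function names, the learning rate, and the assumption that masking handles the instantaneous vehicle-availability constraint while a Lagrange multiplier penalises the cumulative tardiness cost are all illustrative choices, not details taken from the paper.

```python
import math


def masked_softmax(logits, valid):
    """Softmax over valid actions only; invalid actions get probability 0.

    `valid` is a hypothetical boolean mask, e.g. False for a broken-down vehicle.
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, v in zip(logits, valid) if v)
    exps = [math.exp(l - top) if v else 0.0 for l, v in zip(logits, valid)]
    total = sum(exps)
    return [e / total for e in exps]


def penalised_reward(reward, cost, lam):
    """Lagrangian relaxation: fold the constraint cost into the reward."""
    return reward - lam * cost


def update_multiplier(lam, mean_cost, threshold, lr=0.01):
    """Dual ascent step: raise lam when the mean cost exceeds the threshold,
    lower it otherwise, and keep it non-negative."""
    return max(0.0, lam + lr * (mean_cost - threshold))
```

For example, `masked_softmax([1.0, 2.0, 3.0], [True, False, True])` assigns zero probability to the second action, so a policy sampling from the result can never select it, whereas the multiplier only discourages constraint violation on average over an episode.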
