Abstract
Failure-Directed Search (FDS) is a significant complete generic searchalgorithm used in Constraint Programming (CP) to efficiently explore the searchspace, proven particularly effective on scheduling problems. This paperanalyzes FDS's properties, showing that minimizing the size of its search treeguided by ranked branching decisions is closely related to the Multi-armedbandit (MAB) problem. Building on this insight, MAB reinforcement learningalgorithms are applied to FDS, extended with problem-specific refinements andparameter tuning, and evaluated on the two most fundamental schedulingproblems, the Job Shop Scheduling Problem (JSSP) and Resource-ConstrainedProject Scheduling Problem (RCPSP). The resulting enhanced FDS, using the bestextended MAB algorithm and configuration, performs 1.7 times faster on the JSSPand 2.1 times faster on the RCPSP benchmarks compared to the originalimplementation in a new solver called OptalCP, while also being 3.5 timesfaster on the JSSP and 2.1 times faster on the RCPSP benchmarks than thecurrent state-of-the-art FDS algorithm in IBM CP Optimizer 22.1. Furthermore,using only a 900-second time limit per instance, the enhanced FDS improved theexisting state-of-the-art lower bounds of 78 of 84 JSSP and 226 of 393 RCPSPstandard open benchmark instances while also completely closing a few of them.