Abstract
In planning and reinforcement learning, the identification of common subgoal structures across problems is important when goals are to be achieved over long horizons. Recently, it has been shown that such structures can be expressed as feature-based rules, called sketches, over a number of classical planning domains. These sketches split problems into subproblems which then become solvable in low polynomial time by a greedy sequence of IW$(k)$ searches. Methods for learning sketches using feature pools and min-SAT solvers have been developed, yet they face two key limitations: scalability and expressivity. In this work, we address these limitations by formulating the problem of learning sketch decompositions as a deep reinforcement learning (DRL) task, where general policies are sought in a modified planning problem where the successor states of a state s are defined as those reachable from s through an IW$(k)$ search. The sketch decompositions obtained through this method are experimentally evaluated across various domains, and problems are regarded as solved by the decomposition when the goal is reached through a greedy sequence of IW$(k)$ searches. While our DRL approach for learning sketch decompositions does not yield interpretable sketches in the form of rules, we demonstrate that the resulting decompositions can often be understood in a crisp manner.
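
As an illustration of what "solved by a greedy sequence of IW$(k)$ searches" can mean operationally, the following is a minimal sketch and not the authors' implementation. It assumes states are sets of atoms, treats IW$(k)$ as a breadth-first search that prunes states containing no new atom tuple of size at most $k$, and uses a hand-written subgoal test as a stand-in for the learned decomposition. The toy key-fetching domain and all names (`iw_search`, `greedy_iw`, `is_subgoal`, and so on) are hypothetical.

```python
from collections import deque
from itertools import combinations


def iw_search(start, successors, is_subgoal, k=1):
    """Breadth-first search with novelty pruning (a simple reading of IW(k)).

    A generated state is kept only if it contains some tuple of at most k
    atoms that no earlier state in this search contained.
    """
    seen = set()

    def is_novel(state):
        atoms = sorted(state, key=repr)
        new = {t for size in range(1, k + 1)
               for t in combinations(atoms, size)} - seen
        seen.update(new)
        return bool(new)

    is_novel(start)
    queue = deque([(start, [start])])
    while queue:
        state, path = queue.popleft()
        if state != start and is_subgoal(start, state):
            return path
        for nxt in successors(state):
            if is_novel(nxt):
                queue.append((nxt, path + [nxt]))
    return None  # no subgoal state reachable within width k


def greedy_iw(start, successors, is_goal, is_subgoal, k=1, max_steps=100):
    """Chain IW(k) searches, each ending at a subgoal of the current state."""
    state, plan = start, [start]
    for _ in range(max_steps):
        if is_goal(state):
            return plan
        path = iw_search(state, successors, is_subgoal, k)
        if path is None:
            return None
        plan += path[1:]
        state = path[-1]
    return plan if is_goal(state) else None


# Hypothetical toy domain: walk a line 0..4, pick up a key at 4, return to 0.
def make_state(pos, has_key):
    atoms = {("at", pos)}
    if has_key:
        atoms.add(("key",))
    return frozenset(atoms)


def position(state):
    return next(a[1] for a in state if a[0] == "at")


def successors(state):
    pos, has_key = position(state), ("key",) in state
    result = [make_state(p, has_key) for p in (pos - 1, pos + 1) if 0 <= p <= 4]
    if pos == 4 and not has_key:
        result.append(make_state(4, True))  # pick up the key
    return result


def is_subgoal(start, state):
    # Assumed stand-in for a learned decomposition: first obtain the key,
    # then move closer to position 0 while holding it.
    got_key = ("key",) in state and ("key",) not in start
    closer_home = ("key",) in state and position(state) < position(start)
    return got_key or closer_home


goal = make_state(0, True)
plan = greedy_iw(make_state(0, False), successors, lambda s: s == goal, is_subgoal)
```

In this toy problem each subproblem has width 1: the first search walks to the key and picks it up, and each later search takes a single step back toward position 0. In the setting described in the abstract, the subgoal test would instead come from the learned general policy rather than from a hand-coded rule.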