Abstract
We propose a new deep learning model for goal-driven tasks that require intuitive physical reasoning and intervention in the scene to achieve a desired end goal. Its modular structure is motivated by a hypothesized sequence of intuitive steps that humans apply when trying to solve such a task. The model first predicts the path the target object would follow without intervention and the path the target object should follow in order to solve the task. Next, it predicts the desired path of the action object and generates the placement of the action object. All components of the model are trained jointly in a supervised way; each component receives its own learning signal, but learning signals are also backpropagated through the entire architecture. To evaluate the model we use PHYRE, a benchmark for goal-driven physical reasoning in 2D mechanics puzzles.
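The four-stage data flow and the joint training objective described above can be illustrated with a minimal sketch. It is not the model itself: the stage names, the inputs passed to each stage, the toy arithmetic inside `toy_stages`, the use of mean squared error, and the equal loss weights are all assumptions made for illustration. In a real implementation each stage would be a differentiable network, so that the summed loss also sends gradients back through the earlier stages.

```python
from typing import Callable, Dict, List, Optional

Path = List[float]  # hypothetical stand-in for a predicted path or placement


def mse(pred: Path, target: Path) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def run_pipeline(scene: Path, stages: Dict[str, Callable[..., Path]]) -> Dict[str, Path]:
    """Chain the four stages in the order given in the abstract.

    Which earlier outputs each stage receives is an assumption of this sketch.
    """
    out: Dict[str, Path] = {}
    # 1. Path of the target object without intervention.
    out["base_path"] = stages["base_path"](scene)
    # 2. Path the target object should follow to solve the task.
    out["goal_path"] = stages["goal_path"](scene, out["base_path"])
    # 3. Desired path of the action object.
    out["action_path"] = stages["action_path"](scene, out["base_path"], out["goal_path"])
    # 4. Placement of the action object.
    out["placement"] = stages["placement"](scene, out["action_path"])
    return out


def joint_loss(
    outputs: Dict[str, Path],
    targets: Dict[str, Path],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Sum of per-stage supervised losses (one learning signal per component)."""
    if weights is None:
        weights = {name: 1.0 for name in outputs}  # equal weights: an assumption
    return sum(weights[name] * mse(outputs[name], targets[name]) for name in outputs)


# Toy stand-ins for the learned components (purely hypothetical arithmetic).
toy_stages: Dict[str, Callable[..., Path]] = {
    "base_path": lambda scene: list(scene),
    "goal_path": lambda scene, base: [b + 1.0 for b in base],
    "action_path": lambda scene, base, goal: [g - b for g, b in zip(goal, base)],
    "placement": lambda scene, action: [action[0]],
}
```

Because every later stage consumes the outputs of earlier ones, an error in the placement loss depends on all upstream predictions, which is what allows the learning signal to propagate through the entire architecture when the stages are differentiable.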