Abstract
Designing effective embodied multi-agent systems is critical for solving complex real-world tasks across domains. Due to the complexity of multi-agent embodied systems, existing methods fail to automatically generate safe and efficient training data for such systems. To this end, we propose the concept of compositional constraints for embodied multi-agent systems, addressing the challenges arising from collaboration among embodied agents. We design various interfaces tailored to different types of constraints, enabling seamless interaction with the physical world. Leveraging compositional constraints and specifically designed interfaces, we develop an automated data collection framework for embodied multi-agent systems and introduce the first benchmark for embodied multi-agent manipulation, RoboFactory. Based on the RoboFactory benchmark, we adapt and evaluate imitation learning and analyze its performance on agent tasks of varying difficulty. Furthermore, we explore architectures and training strategies for multi-agent imitation learning, aiming to build safe and efficient embodied multi-agent systems.