Abstract
This paper addresses the task of generating two-character online interactions. Previously, two main settings existed for two-character interaction generation: (1) generating one character's motions based on the counterpart's complete motion sequence, and (2) jointly generating two-character motions based on specific conditions. We argue that these settings fail to model the process of real-life two-character interactions, where humans react to their counterparts in real time and act as independent individuals. In contrast, we propose an online reaction policy, called Ready-to-React, to generate the next character pose based on past observed motions. Each character has its own reaction policy as its "brain", enabling the characters to interact like real humans in a streaming manner. Our policy is implemented by incorporating a diffusion head into an auto-regressive model, which can dynamically respond to the counterpart's motions while effectively mitigating error accumulation throughout the generation process. We conduct comprehensive experiments on the challenging boxing task. Experimental results demonstrate that our method outperforms existing baselines and can generate extended motion sequences. Additionally, we show that our approach can be controlled by sparse signals, making it well-suited for VR and other online interactive environments.