Abstract
The ability to synthesize realistic and diverse indoor furniture layouts, either automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than those of existing methods. In addition, it has fewer parameters, is simpler to implement and train, and runs up to 8 times faster than existing methods.
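
The permutation equivariance mentioned above can be illustrated with a minimal NumPy sketch. This is not the ATISS implementation: it assumes a single self-attention head without positional encodings, random weights, and mean pooling as one possible order-independent summary of the partial scene. The names self_attention and encode_scene are hypothetical and chosen only for illustration.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention with no positional encoding (illustrative only)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row-wise maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def encode_scene(x, w_q, w_k, w_v):
    """Assumed pooling step: averaging over objects removes any dependence on order."""
    return self_attention(x, w_q, w_k, w_v).mean(axis=0)


rng = np.random.default_rng(0)
n_objects, d = 5, 8
# Each row stands in for one object's embedding (e.g. derived from a 3D bounding box).
objects = rng.normal(size=(n_objects, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
perm = rng.permutation(n_objects)

out = self_attention(objects, w_q, w_k, w_v)
out_perm = self_attention(objects[perm], w_q, w_k, w_v)

# Equivariance: permuting the input objects permutes the outputs in the same way.
equivariant = np.allclose(out[perm], out_perm)

# Invariance after pooling: the scene summary does not depend on object order.
invariant = np.allclose(
    encode_scene(objects, w_q, w_k, w_v),
    encode_scene(objects[perm], w_q, w_k, w_v),
)
```

Under these assumptions, a model conditioned on such a pooled summary sees the same context regardless of the order in which the user supplies the existing objects, which is what allows a single trained model to handle completion, re-arrangement, and object suggestion.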