Abstract
We introduce Spline-based Transformers, a novel class of Transformer models that eliminate the need for positional encoding. Inspired by workflows using splines in computer animation, our Spline-based Transformers embed an input sequence of elements as a smooth trajectory in latent space. Overcoming drawbacks of positional encoding such as sequence length extrapolation, Spline-based Transformers also provide a novel way for users to interact with transformer latent spaces by directly manipulating the latent control points to create new latent trajectories and sequences. We demonstrate the superior performance of our approach in comparison to conventional positional encoding on a variety of datasets, ranging from synthetic 2D to large-scale real-world datasets of images, 3D shapes, and animations.
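As a rough illustration of what a latent trajectory defined by control points could look like, the sketch below samples a smooth curve from a handful of latent control points. It is not the method of this paper: the spline type (a uniform cubic B-spline), the function names `cubic_bspline_point` and `sample_trajectory`, and the two-dimensional latent vectors are all assumptions made for illustration only.

```python
from typing import List, Sequence

Vector = List[float]


def cubic_bspline_point(p0: Sequence[float], p1: Sequence[float],
                        p2: Sequence[float], p3: Sequence[float],
                        u: float) -> Vector:
    """Evaluate one uniform cubic B-spline segment at local parameter u in [0, 1]."""
    # Standard uniform cubic B-spline basis functions; they sum to 1 for any u.
    b0 = (1.0 - u) ** 3 / 6.0
    b1 = (3.0 * u ** 3 - 6.0 * u ** 2 + 4.0) / 6.0
    b2 = (-3.0 * u ** 3 + 3.0 * u ** 2 + 3.0 * u + 1.0) / 6.0
    b3 = u ** 3 / 6.0
    return [b0 * a + b1 * b + b2 * c + b3 * d
            for a, b, c, d in zip(p0, p1, p2, p3)]


def sample_trajectory(control_points: Sequence[Sequence[float]],
                      num_samples: int) -> List[Vector]:
    """Sample num_samples latent vectors evenly in t over [0, 1] along the spline.

    Hypothetical helper: the same control points can be sampled at any
    sequence length, since position is a continuous parameter t rather
    than a discrete index.
    """
    if len(control_points) < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")
    num_segments = len(control_points) - 3
    samples: List[Vector] = []
    for i in range(num_samples):
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        s = t * num_segments                      # global spline parameter
        seg = min(int(s), num_segments - 1)       # clamp so t = 1 stays in the last segment
        u = s - seg                               # local parameter within the segment
        samples.append(cubic_bspline_point(*control_points[seg:seg + 4], u))
    return samples


# Assumed toy latent control points in a 2D latent space.
controls = [[0.0, 0.0], [1.0, 2.0], [2.0, -1.0], [3.0, 0.5], [4.0, 0.0]]
short_sequence = sample_trajectory(controls, 8)    # 8 latent vectors
long_sequence = sample_trajectory(controls, 64)    # same curve, 64 latent vectors
```

In this sketch, moving one entry of `controls` changes the curve only in the segments that use it, which hints at the kind of direct manipulation of latent control points the abstract describes. Sampling the same control points at two different lengths shows how a continuous parameter can stand in for a discrete position index.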