CRoPE: Efficient Parametrization of Rotary Positional Embedding

  • 2026-04-01 17:35:04
  • Beicheng Lou, Zifei Xu, Vivian W. H. Wong

Abstract

Rotary positional embedding has become the state-of-the-art approach to encoding position information in transformer-based models. While it is often succinctly expressed in complex linear algebra, we note that the $Q/K/V$ projections, as actually implemented, are not equivalent to complex linear transformations. We argue that a complex linear transformation is a more natural parametrization and saves nearly 50\% of the parameters within the attention block. We show empirically that removing this redundancy has negligible impact on model performance. Our modification achieves more efficient parameter usage, as well as a cleaner interpretation of the representation space.
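The parameter saving can be illustrated with a small sketch. This is not the authors' implementation. It only demonstrates the general linear-algebra fact behind the claim: a complex $(d/2) \times (d/2)$ matrix $W = A + iB$ acts on a real $d$-dimensional vector like the structured real matrix $[[A, -B], [B, A]]$, which has $d^2/2$ free real parameters instead of the $d^2$ of an unconstrained real projection. The helper names and the layout (real parts first, then imaginary parts) are assumptions made for illustration.

```python
import random

def complex_matvec(W, z):
    """Apply a complex matrix W (list of rows) to a complex vector z."""
    return [sum(w * x for w, x in zip(row, z)) for row in W]

def as_real_block_matrix(W):
    """Real 2n x 2n matrix [[A, -B], [B, A]] equivalent to W = A + iB."""
    n = len(W)
    top = [[W[r][c].real for c in range(n)] + [-W[r][c].imag for c in range(n)]
           for r in range(n)]
    bottom = [[W[r][c].imag for c in range(n)] + [W[r][c].real for c in range(n)]
              for r in range(n)]
    return top + bottom

def real_matvec(M, x):
    """Apply a real matrix M (list of rows) to a real vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def param_counts(d):
    """Free real parameters: unconstrained d x d map vs. complex (d/2) x (d/2) map."""
    n = d // 2
    return d * d, 2 * n * n

random.seed(0)
d = 8            # hypothetical head dimension, chosen only for the demo
n = d // 2
W = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
     for _ in range(n)]
z = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

# Complex view: n complex coordinates.
out_complex = complex_matvec(W, z)

# Real view: assumed layout is [real parts..., imaginary parts...].
x = [v.real for v in z] + [v.imag for v in z]
out_real = real_matvec(as_real_block_matrix(W), x)

# Both views should agree up to floating-point rounding.
expected = [v.real for v in out_complex] + [v.imag for v in out_complex]
max_err = max(abs(a - b) for a, b in zip(out_real, expected))

full_params, complex_params = param_counts(d)
print(full_params, complex_params)
```

Under these assumptions the structured map uses half as many real parameters as the unconstrained one, which is consistent with the saving of nearly 50\% stated in the abstract. Which projections the paper constrains, and how dimensions are paired in practice, is not specified here.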
