Parallelized Autoregressive Visual Generation

Abstract

Autoregressive models have emerged as a powerful approach for visual generation but suffer from slow inference speed due to their sequential token-by-token prediction process. In this paper, we propose a simple yet effective approach for parallelized autoregressive visual generation that improves generation efficiency while preserving the advantages of autoregressive modeling. Our key insight is that parallel generation depends on visual token dependencies: tokens with weak dependencies can be generated in parallel, while strongly dependent adjacent tokens are difficult to generate together, as their independent sampling may lead to inconsistencies. Based on this observation, we develop a parallel generation strategy that generates distant tokens with weak dependencies in parallel while maintaining sequential generation for strongly dependent local tokens. Our approach can be seamlessly integrated into standard autoregressive models without modifying the architecture or tokenizer. Experiments on ImageNet and UCF-101 demonstrate that our method achieves a 3.6x speedup with comparable quality and up to 9.5x speedup with minimal quality degradation across both image and video generation tasks. We hope this work will inspire future research in efficient visual generation and unified autoregressive modeling. Project page: https://yuqingwang1029.github.io/PAR-project.
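
The idea of generating distant tokens together while keeping nearby tokens sequential can be illustrated with a toy generation schedule. The sketch below is only an illustration under stated assumptions, not the paper's implementation: the function name `parallel_schedule`, the square-region partition, and the raster order within each region are all hypothetical choices made for this example. It splits a token grid into equal regions and, at each step, emits the token at the same local position in every region, so tokens generated in the same step lie in different regions while tokens inside a region are still produced one after another.

```python
def parallel_schedule(height, width, regions_per_side):
    """Return a list of steps; each step is a list of (row, col) grid
    positions generated together.

    Hypothetical illustration: the grid is cut into
    regions_per_side x regions_per_side equal regions, and step k holds
    the k-th token (raster order inside a region) of every region.
    """
    if height % regions_per_side or width % regions_per_side:
        raise ValueError("grid must divide evenly into regions")
    region_h = height // regions_per_side
    region_w = width // regions_per_side

    steps = []
    # Walk local positions sequentially: neighbours inside a region are
    # never sampled in the same step.
    for local_r in range(region_h):
        for local_c in range(region_w):
            # Same local position in every region: these tokens are at
            # least one region size apart, so they are grouped together.
            group = [
                (reg_r * region_h + local_r, reg_c * region_w + local_c)
                for reg_r in range(regions_per_side)
                for reg_c in range(regions_per_side)
            ]
            steps.append(group)
    return steps


steps = parallel_schedule(4, 4, 2)
print(len(steps))  # 4
print(steps[0])    # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

In this toy 4x4 grid the number of sequential steps falls from 16 to 4. That count describes only this schedule and is unrelated to the speedups reported above.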
