Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

  • 2025-01-29 18:00:19
  • Xiaokang Chen, Zhiyu Wu, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan

Abstract

In this work, we introduce Janus-Pro, an advanced version of the previous work Janus. Specifically, Janus-Pro incorporates (1) an optimized training strategy, (2) expanded training data, and (3) scaling to larger model size. With these improvements, Janus-Pro achieves significant advancements in both multimodal understanding and text-to-image instruction-following capabilities, while also enhancing the stability of text-to-image generation. We hope this work will inspire further exploration in the field. Code and models are publicly available.

 
