Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

  • 2025-04-11 17:46:20
  • Team Seaweed, Ceyuan Yang, Zhijie Lin, Yang Zhao, Shanchuan Lin, Zhibei Ma, Haoyuan Guo, Hao Chen, Lu Qi, Sen Wang, Feng Cheng, Feilong Zuo, Xuejiao Zeng, Ziyan Yang, Fangyuan Kong, Zhiwu Qing, Fei Xiao, Meng Wei, Tuyen Hoang, Siyu Zhang, Peihao Zhu, Qi Zhao, Jiangqiao Yan, Liangke Gui, Sheng Bi, Jiashi Li, Yuxi Ren, Rui Wang, Huixia Li, Xuefeng Xiao, Shu Liu, Feng Ling, Heng Zhang, Houmin Wei, Huafeng Kuang, Jerry Duncan, Junda Zhang, Junru Zheng, Li Sun, Manlin Zhang, Renfei Sun, Xiaobin Zhuang, Xiaojie Li, Xin Xia, Xuyan Chi, Yanghua Peng, Yuping Wang, Yuxuan Wang, Zhongkai Zhao, Zhuo Chen, Zuquan Song, Zhenheng Yang, Jiashi Feng, Jianchao Yang, Lu Jiang

Abstract

This technical report presents a cost-efficient strategy for training a video generation foundation model. We present a mid-sized research model with approximately 7 billion parameters (7B) called Seaweed-7B, trained from scratch using 665,000 H100 GPU hours. Despite being trained with moderate computational resources, Seaweed-7B demonstrates highly competitive performance compared to contemporary video generation models of much larger size. Design choices are especially crucial in a resource-constrained setting. This technical report highlights the key design decisions that enhance the performance of the medium-sized diffusion model. Empirically, we make two observations: (1) Seaweed-7B matches or even surpasses larger models trained with substantially greater GPU resources, and (2) our model, which exhibits strong generalization ability, can be effectively adapted across a wide range of downstream applications through either lightweight fine-tuning or continued training. See the project page at https://seaweed.video/

 
