InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

  • 2025-02-17 09:07:32
  • Congkai Xie, Shuo Cai, Wenjun Wang, Pengxiang Li, Zhijie Sang, Kejing Yang, Yiming Zhang, Zhen Li, Guanghao Zhu, Zeyu Liu, Yang Yu, Yuhang Liu, Su Lu, Baoyi He, Qi Zhou, Xiaotian Han, Jianbo Yuan, Shengyu Zhang, Fei Wu, Hongxia Yang

Abstract

Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have made significant advancements in reasoning capabilities. However, they still face challenges such as high computational demands and privacy concerns. This paper focuses on developing efficient Small Language Models (SLMs) and Multimodal Small Language Models (MSLMs) that retain competitive reasoning abilities. We introduce a novel training pipeline that enhances reasoning capabilities and facilitates deployment on edge devices, achieving state-of-the-art performance while minimizing development costs. InfiR aims to advance AI systems by improving reasoning, reducing adoption barriers, and addressing privacy concerns through smaller model sizes. Resources are available at https://github.com/Reallm-Labs/InfiR.

 
