TinyLlama: An Open-Source Small Language Model

  • 2024-01-04 17:54:59
  • Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu

Abstract

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.

 
