AraGPT2: Pre-Trained Transformer for Arabic Language Generation

  • 2020-12-31 09:48:05
  • Wissam Antoun, Fady Baly, Hazem Hajj
  • 4

Abstract

Recently, pretrained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances, primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on large Arabic corpora of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. We evaluate different size variants of AraGPT2 using the perplexity measure, where AraGPT2-mega achieves a perplexity of 29.8 on held-out articles from Wikipedia. Pretrained variants of AraGPT2 (base, medium, large, mega) are publicly available on https://github.com/aub-mind/arabert/aragpt2, in the hope of encouraging new research directions and applications for Arabic NLP.
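
For readers unfamiliar with the perplexity measure mentioned above, the following is a minimal sketch of the standard definition: the exponential of the average negative log-likelihood per token. The function name `perplexity` and the toy input are illustrative assumptions, and the abstract does not state how the paper normalizes (for example, per token or per word), so this should not be read as the authors' evaluation code.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities assigned to each token.

    Hypothetical helper for illustration only; it implements the textbook
    definition exp(-(1/N) * sum(log p_i)), not the paper's exact procedure.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_negative_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_negative_log_likelihood)


# Toy check: a model that assigns probability 1/4 to every token is as
# uncertain as a uniform choice among 4 options, so its perplexity is about 4.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))
```

Under this definition, lower is better: a reported perplexity of 29.8 can be read as the model being, on average, about as uncertain as a uniform choice among roughly 30 candidates at each step.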

 
