Abstract
The BERT family of neural language models has become highly popular due to their ability to provide sequences of text with rich context-sensitive token encodings that generalise well to many Natural Language Processing tasks. Over 120 monolingual BERT models covering over 50 languages have been released, as well as a multilingual model trained on 104 languages. We introduce gaBERT, a monolingual BERT model for the Irish language. We compare our gaBERT model to multilingual BERT and show that gaBERT provides better representations for a downstream parsing task. We also show how different filtering criteria, vocabulary size and the choice of subword tokenisation model affect downstream performance. We release gaBERT and related code to the community.