Abstract
Contextual pretrained language models, such as BERT (Devlin et al., 2019), have made significant breakthroughs in various NLP tasks by training on large-scale unlabeled text resources. The financial sector also accumulates a large amount of financial communication text. However, there are no pretrained finance-specific language models available. In this work, we address this need by pretraining a financial domain-specific BERT model, FinBERT, using large-scale financial communication corpora. Experiments on three financial sentiment classification tasks confirm the advantage of FinBERT over the generic-domain BERT model. The code and pretrained models are available at https://github.com/yya518/FinBERT. We hope this will be useful for practitioners and researchers working on financial NLP tasks.