Abstract
Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available models have been trained either on English data or on the concatenation of data in multiple languages. This makes practical use of such models very limited in all languages except English. Aiming to address this issue for French, we release CamemBERT, a French version of Bidirectional Encoder Representations from Transformers (BERT). We measure the performance of CamemBERT compared to multilingual models in multiple downstream tasks, namely part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference. CamemBERT improves the state of the art for most of the tasks considered. We release the pretrained model for CamemBERT hoping to foster research and downstream applications for French NLP.