In this paper, we present our submission for the English to Czech Text Translation Task of IWSLT 2019. Our system aims to study how pre-trained language models, used as input embeddings, can improve a specialized machine translation system trained on little data. To this end, we implemented a Transformer-based encoder-decoder neural system that can use the output of a pre-trained language model as input embeddings, and we compared its performance under three configurations: 1) without any pre-trained language model (constrained), 2) using a language model trained on the monolingual parts of the allowed English-Czech data (constrained), and 3) using a language model trained on a large quantity of external monolingual data (unconstrained). We used BERT as the external pre-trained language model (configuration 3), and the BERT architecture for training our own language model (configuration 2). Regarding the training data, we trained our MT system on a small quantity of parallel text: one set consists only of the provided MuST-C corpus, and the other combines the MuST-C corpus with the News Commentary corpus from WMT. We observed that using the external pre-trained BERT improves the scores of our system by +0.8 to +1.5 BLEU on our development set, and by +0.97 to +1.94 BLEU on the test set. However, using our own language model trained only on the allowed parallel data seems to improve machine translation performance only when the system is trained on the smallest dataset.