We train a recurrent neural network language model using a distributed, on-device learning framework called federated learning for the purpose of next-word prediction in a virtual keyboard for smartphones. Server-based training using stochastic gradient descent is compared with training on client devices using the Federated Averaging algorithm. The federated algorithm, which enables training on a higher-quality dataset for this use case, is shown to achieve better prediction recall. This work demonstrates the feasibility and benefit of training language models on client devices without exporting sensitive user data to servers. The federated learning environment gives users greater control over their data and simplifies the task of incorporating privacy by default with distributed training and aggregation across a population of client devices.
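As a rough illustration of the Federated Averaging idea mentioned above, the following minimal sketch runs local SGD on each client and then combines the resulting weights on the server, weighting each client by its number of examples. It is not the model or training setup used in this work: a toy one-dimensional linear model stands in for the recurrent language model, and the names `local_sgd`, `federated_average`, and `run_round`, as well as the learning rate and data, are hypothetical choices made only for this example.

```python
from typing import List, Tuple

Weights = List[float]               # [slope, intercept] of a toy linear model
Example = Tuple[float, float]       # (x, y) pair held on a single client


def local_sgd(weights: Weights, examples: List[Example],
              lr: float = 0.1, epochs: int = 1) -> Weights:
    """Run plain SGD on one client's private data, starting from the global weights."""
    w, b = weights
    for _ in range(epochs):
        for x, y in examples:
            err = (w * x + b) - y   # prediction error under squared loss
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its example count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_round(global_weights: Weights,
              clients: List[List[Example]]) -> Weights:
    """One round: each client trains locally, the server only sees weights."""
    updates = [local_sgd(list(global_weights), data) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Hypothetical client datasets drawn from y = 2x + 1; raw examples never leave a client.
    clients = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0)]]
    weights = [0.0, 0.0]
    for _ in range(20):
        weights = run_round(weights, clients)
    print(weights)
```

In this sketch only the weight vectors are sent to the server, which reflects the point that user data stays on the device. A production system would add client sampling, secure aggregation, and a far larger model, none of which are shown here.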