Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning

Abstract

Almost none of the 2,000+ languages spoken in Africa have widely available automatic speech recognition systems, and the required data is available for only a few languages. We have experimented with two techniques that may provide pathways to large vocabulary speech recognition for African languages: multilingual modeling and self-supervised learning. We gathered available open source data and collected data for 15 languages, and trained experimental models using these techniques. Our results show that pooling the small amounts of data available in multilingual end-to-end models and pre-training on unsupervised data can help improve speech recognition quality for many African languages.
