Language Transfer for Early Warning of Epidemics from Social Media

  • 2019-10-10 12:42:19
  • Mattias Appelgren, Patrick Schrempf, Matúš Falis, Satoshi Ikeda, Alison Q O'Neil

Abstract

Statements on social media can be analysed to identify individuals who are experiencing red flag medical symptoms, allowing early detection of the spread of disease such as influenza. Since disease does not respect cultural borders and may spread between populations speaking different languages, we would like to build multilingual models. However, the data required to train models for every language may be difficult, expensive and time-consuming to obtain, particularly for low-resource languages. Taking Japanese as our target language, we explore methods by which data in one language might be used to build models for a different language. We evaluate strategies of training on machine translated data and of zero-shot transfer through the use of multilingual models. We find that the choice of source language impacts the performance, with Chinese-Japanese being a better language pair than English-Japanese. Training on machine translated data shows promise, especially when used in conjunction with a small amount of target language data.

 
