Unsupervised Clinical Language Translation

  • 2019-02-04 13:47:18
  • Wei-Hung Weng, Yu-An Chung, Peter Szolovits


As patients' access to their doctors' clinical notes becomes common, translating professional clinical jargon into layperson-understandable language is essential to improving patient-clinician communication. Such translation yields better clinical outcomes by enhancing patients' understanding of their own health conditions, and thus improving patients' involvement in their own care. Existing research has used dictionary-based word replacement or definition insertion to address this need. However, these methods are limited by expert curation, which is hard to scale and has trouble generalizing to unseen datasets that do not share an overlapping vocabulary. In contrast, we approach the clinical word and sentence translation problem in a completely unsupervised manner. We show that a framework using representation learning, bilingual dictionary induction and statistical machine translation yields the best precision at 10 of 0.827 on professional-to-consumer word translation, and mean opinion scores of 4.10 and 4.28 out of 5 for clinical correctness and layperson readability, respectively, on sentence translation. Our fully unsupervised strategy overcomes the curation problem, and the clinically meaningful evaluation reduces biases from inappropriate evaluators, both of which are critical in clinical machine learning.
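
The precision at 10 figure quoted above is a retrieval metric commonly used for bilingual dictionary induction: a source word counts as correct if any of its gold translations appears among its k nearest target words. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes the professional and consumer embeddings have already been mapped into a shared space and uses cosine similarity for ranking; the function names, the toy two-dimensional vectors, and the word pairs are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def precision_at_k(src_vecs, tgt_vecs, gold, k=10):
    """Fraction of source words with a gold translation in their top-k neighbors.

    src_vecs, tgt_vecs: dict mapping word -> vector (assumed to share one space).
    gold: dict mapping source word -> set of acceptable target words.
    """
    hits = 0
    for word, vec in src_vecs.items():
        # Rank every target word by similarity to the source word.
        ranked = sorted(
            tgt_vecs, key=lambda t: cosine(vec, tgt_vecs[t]), reverse=True
        )
        if gold[word] & set(ranked[:k]):
            hits += 1
    return hits / len(src_vecs)


# Toy, hand-made vectors (hypothetical; real embeddings are learned from text).
professional = {
    "myocardial infarction": [1.0, 0.1],
    "hypertension": [0.1, 1.0],
    "pyrexia": [1.0, 0.2],
}
consumer = {
    "heart attack": [0.9, 0.2],
    "high blood pressure": [0.2, 0.9],
    "fever": [0.7, 0.7],
}
gold = {
    "myocardial infarction": {"heart attack"},
    "hypertension": {"high blood pressure"},
    "pyrexia": {"fever"},
}

p_at_1 = precision_at_k(professional, consumer, gold, k=1)
p_at_2 = precision_at_k(professional, consumer, gold, k=2)
print(f"P@1 = {p_at_1:.3f}, P@2 = {p_at_2:.3f}")
```

In this toy setup "pyrexia" ranks "heart attack" above "fever", so it is missed at k=1 but recovered at k=2, which illustrates why a larger k gives a more forgiving score.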

