On Creating an English-Thai Code-switched Machine Translation in Medical Domain

  • 2024-10-21 18:25:32
  • Parinthapat Pengpun, Krittamate Tiankanon, Amrest Chinkamol, Jiramet Kinchagawat, Pitchaya Chairuengjitjaras, Pasit Supholkhan, Pubordee Aussavavirojekul, Chiraphat Boonnag, Kanyakorn Veerakanjana, Hirunkul Phimsiri, Boonthicha Sae-jia, Nattawach Sataudom, Piyalitt Ittichaiwong, Peerat Limkonchotiwat

Abstract

Machine translation (MT) in the medical domain plays a pivotal role in enhancing healthcare quality and disseminating medical knowledge. Despite advancements in English-Thai MT technology, common MT approaches often underperform in the medical field because they cannot translate medical terminology precisely. Our research prioritizes not merely improving translation accuracy but also keeping medical terminology in English within the translated text through code-switched (CS) translation. We developed a method to produce CS medical translation data, fine-tuned a CS translation model with this data, and evaluated its performance against strong baselines, such as Google Neural Machine Translation (NMT) and GPT-3.5/GPT-4. Our model demonstrated competitive performance in automatic metrics and was highly favored in human preference evaluations. Our evaluation results also show that medical professionals significantly prefer CS translations that accurately maintain critical English terms, even if doing so slightly compromises fluency. Our code and test set are publicly available at https://github.com/preceptorai-org/NLLB_CS_EM_NLP2024.

 
