Vocabulary Transfer for Medical Texts

  • 2022-08-04 10:53:22
  • Vladislav D. Mosin, Ivan P. Yamshchikov

Abstract

Vocabulary transfer is a transfer learning subtask in which language models are fine-tuned with a corpus-specific tokenization instead of the default one used during pretraining. This usually improves the resulting performance of the model, and in this paper we demonstrate that vocabulary transfer is especially beneficial for medical text processing. Using three different medical natural language processing datasets, we show that vocabulary transfer provides up to ten extra percentage points of downstream classifier accuracy.

 
