Abstract
Domain adaptation or transfer learning using pre-trained language models suchas BERT has proven to be an effective approach for many natural languageprocessing tasks. In this work, we propose to formulate word sensedisambiguation as a relevance ranking task, and fine-tune BERT on sequence-pairranking task to select the most probable sense definition given a contextsentence and a list of candidate sense definitions. We also introduce a dataaugmentation technique for WSD using existing example sentences from WordNet.Using the proposed training objective and data augmentation technique, ourmodels are able to achieve state-of-the-art results on the English all-wordsbenchmark datasets.
Quick Read (beta)
loading the full paper ...