Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence

Abstract

Sememes, defined in linguistics as the minimum semantic units of human languages, have been proven useful in many NLP tasks. Since manually constructing and updating sememe knowledge bases (KBs) is costly, the task of automatic sememe prediction has been proposed to assist sememe annotation. In this paper, we explore the approach of applying dictionary definitions to predicting sememes for unannotated words. We find that the sememes of each word are usually semantically matched to different words in its dictionary definition, and we name this matching relationship local semantic correspondence. Accordingly, we propose a Sememe Correspondence Pooling (SCorP) model, which is able to capture this kind of matching to predict sememes. We evaluate our model and baseline methods on HowNet, a widely used sememe KB, and find that our model achieves state-of-the-art performance. Moreover, further quantitative analysis shows that our model can properly learn the local semantic correspondence between sememes and words in dictionary definitions, which explains the effectiveness of our model. The source code of this paper can be obtained from https://github.com/thunlp/scorp.
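
To make the idea of local semantic correspondence concrete, the following is a minimal sketch of pooling over per-word matching scores: each candidate sememe is scored against every word of the definition, and the best local match is kept. It is an illustration only and not the implementation in the repository above. The function names, the dot-product scoring, the choice of max pooling, and the toy three-dimensional vectors are all assumptions made for this example.

```python
from typing import Dict, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product, used here as an assumed matching score."""
    return sum(a * b for a, b in zip(u, v))


def correspondence_pooling(
    definition_vectors: List[Vector],
    sememe_vectors: Dict[str, Vector],
) -> Dict[str, float]:
    """Score each sememe by its best-matching word in the definition.

    For every sememe, a local score is computed against each definition
    word, and the maximum over words is kept (max pooling is an assumption).
    """
    if not definition_vectors:
        raise ValueError("definition must contain at least one word")
    scores = {}
    for sememe, sememe_vec in sememe_vectors.items():
        local_scores = [dot(word_vec, sememe_vec) for word_vec in definition_vectors]
        scores[sememe] = max(local_scores)
    return scores


def predict_sememes(
    definition_vectors: List[Vector],
    sememe_vectors: Dict[str, Vector],
    top_k: int = 2,
) -> List[str]:
    """Return the top_k sememes ranked by pooled score, highest first."""
    scores = correspondence_pooling(definition_vectors, sememe_vectors)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: two definition words and three candidate sememes.
DEFINITION = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
SEMEMES = {
    "human": [0.9, 0.1, 0.0],
    "water": [0.0, 0.8, 0.1],
    "time": [0.0, 0.0, 1.0],
}
```

In this toy setting, `predict_sememes(DEFINITION, SEMEMES)` ranks "human" and "water" first, because each of them matches a different word of the definition, while "time" matches none. A trained model would learn the word and sememe representations rather than take them as given.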
