Abstract
Despite the great success of word embedding, sentence embedding remains a not-well-solved problem. In this paper, we present a supervised learning framework to exploit sentence embedding for the medical question answering task. The learning framework consists of two main parts: 1) a sentence embedding producing module, and 2) a scoring module. The former is developed with contextual self-attention and multi-scale techniques to encode a sentence into an embedding tensor. We call this module Contextual self-Attention Multi-scale Sentence Embedding (CAMSE). The latter employs two scoring strategies: Semantic Matching Scoring (SMS) and Semantic Association Scoring (SAS). SMS measures similarity while SAS captures association between sentence pairs: a medical question concatenated with a candidate choice, and a piece of corresponding supportive evidence. The proposed framework is evaluated on two Medical Question Answering (MedicalQA) datasets collected from real-world applications: medical exams and clinical diagnosis based on electronic medical records (EMR). The comparison results show that our proposed framework achieves significant improvements over competitive baseline approaches. Additionally, a series of controlled experiments illustrates that the multi-scale strategy and the contextual self-attention layer play important roles in producing effective sentence embeddings, and that the two scoring strategies are highly complementary to each other for question answering problems.