SECNLP: A Survey of Embeddings in Clinical Natural Language Processing

  • 2019-03-04 01:37:52
  • Kalyan KS, S Sangeetha

Abstract

Traditional representations like bag of words are high dimensional and sparse, and they ignore word order as well as syntactic and semantic information. Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge which can be transferred to downstream tasks. Even though embeddings have become the de facto standard for representations in deep-learning-based NLP tasks in both general and clinical domains, there is no survey paper which presents a detailed review of embeddings in Clinical Natural Language Processing. In this survey paper, we discuss various medical corpora and their characteristics as well as medical codes, and present a brief overview and comparison of popular embedding models. We classify clinical embeddings into nine types and discuss each embedding type in detail. We discuss various evaluation methods, followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some of the future directions which will advance research in clinical embeddings.
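
The contrast drawn in the abstract between bag of words and dense embeddings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the survey: the toy vocabulary, the example sentences, the helper names `bag_of_words` and `mean_pooled_embedding`, and the randomly initialised table that stands in for pretrained embeddings. Mean pooling is used only as one simple way to obtain a fixed-length vector; it also ignores word order.

```python
import random
from collections import Counter


def bag_of_words(tokens, vocab):
    """Count vector with one dimension per vocabulary word (hypothetical helper)."""
    counts = Counter(tokens)
    return [counts.get(word, 0) for word in vocab]


def mean_pooled_embedding(tokens, table):
    """Average the per-token vectors into one fixed-length vector (hypothetical helper)."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(table[token]):
            total[i] += value
    return [value / len(tokens) for value in total]


# Toy vocabulary; a real clinical vocabulary has tens of thousands of entries,
# which is what makes bag-of-words vectors high dimensional and sparse.
vocab = ["patient", "denies", "chest", "pain", "reports", "no", "fever", "cough"]

# Assumption: random vectors stand in for pretrained embeddings.
DIM = 4
rng = random.Random(0)
table = {word: [rng.uniform(-1.0, 1.0) for _ in range(DIM)] for word in vocab}

a = "patient denies chest pain".split()
b = "pain chest denies patient".split()  # same words, different order
c = "patient reports no fever no cough".split()  # different length

bow_a = bag_of_words(a, vocab)
bow_b = bag_of_words(b, vocab)
emb_a = mean_pooled_embedding(a, table)
emb_c = mean_pooled_embedding(c, table)

print(bow_a, bow_b)            # identical vectors: word order is lost
print(len(bow_a))              # grows with the vocabulary size
print(len(emb_a), len(emb_c))  # fixed size regardless of text length
```

The bag-of-words vector has one dimension per vocabulary word and is mostly zeros, while the pooled vector keeps the same small dimension for texts of any length. With pretrained vectors in place of the random table, the dense representation would additionally carry knowledge learned from the pretraining corpus.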

 
