Deconstructing and reconstructing word embedding algorithms

  • 2019-11-29 18:27:36
  • Edward Newell, Kian Kenyon-Dean, Jackie Chi Kit Cheung

Abstract

Uncontextualized word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the necessary and sufficient conditions required for making performant word embeddings. We find that each algorithm: (1) fits vector-covector dot products to approximate pointwise mutual information (PMI); and, (2) modulates the loss gradient to balance weak and strong signals. We demonstrate that these two algorithmic features are sufficient conditions to construct a novel word embedding algorithm, Hilbert-MLE. We find that its embeddings obtain equivalent or better performance against other algorithms across 17 intrinsic and extrinsic datasets.
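
To make the PMI target mentioned in the abstract concrete, the following is a minimal sketch of how PMI could be computed from a word-context co-occurrence count matrix, using the standard definition PMI(i, j) = log(N_ij * N / (N_i * N_j)). The function name `pmi_matrix` and the toy counts are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the loss or gradient modulation of Word2vec, GloVe, or Hilbert-MLE.

```python
import math


def pmi_matrix(counts):
    """Compute PMI for every (word, context) pair.

    counts[i][j] is the number of times word i co-occurs with context j.
    Pairs that never co-occur get -inf, since log(0) is undefined.
    (Hypothetical helper for illustration only.)
    """
    total = sum(sum(row) for row in counts)        # N: all co-occurrences
    row_sums = [sum(row) for row in counts]        # N_i: marginal word counts
    col_sums = [sum(col) for col in zip(*counts)]  # N_j: marginal context counts

    pmi = []
    for i, row in enumerate(counts):
        pmi_row = []
        for j, n_ij in enumerate(row):
            if n_ij == 0:
                pmi_row.append(float("-inf"))
            else:
                pmi_row.append(math.log(n_ij * total / (row_sums[i] * col_sums[j])))
        pmi.append(pmi_row)
    return pmi


# Toy example (made-up counts): each word co-occurs mostly with "its own" context.
toy_counts = [[4, 1],
              [1, 4]]
toy_pmi = pmi_matrix(toy_counts)
```

In this toy matrix the frequent pairs receive positive PMI and the rare pairs negative PMI; an embedding algorithm of the kind the abstract describes would then fit the dot product of a word vector and a context covector to approximate each entry.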

 
