Are All Good Word Vector Spaces Isomorphic?

  • 2020-10-20 17:22:02
  • Ivan Vulić, Sebastian Ruder, Anders Søgaard

Abstract

Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic. As a result, they perform poorly or fail completely on non-isomorphic spaces. Such non-isomorphism has been hypothesised to result from typological differences between languages. In this work, we ask whether non-isomorphism is also crucially a sign of degenerate word vector spaces. We present a series of experiments across diverse languages which show that variance in performance across language pairs is not only due to typological differences, but can mostly be attributed to the size of the monolingual resources available, and to the properties and duration of monolingual training (e.g. "under-training").

 
