Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

  • 2018-03-06 06:15:46
  • Gabriel Grand, Idan Asher Blank, Francisco Pereira, Evelina Fedorenko


The words of a language reflect the structure of the human mind, allowing us to transmit thoughts between individuals. However, language can represent only a subset of our rich and detailed cognitive architecture. Here, we ask what kinds of common knowledge (semantic memory) are captured by word meanings (lexical semantics). We examine a prominent computational model that represents words as vectors in a multidimensional space, such that proximity between word-vectors approximates semantic relatedness. Because related words appear in similar contexts, such spaces - called "word embeddings" - can be learned from patterns of lexical co-occurrences in natural language. Despite their popularity, a fundamental concern about word embeddings is that they appear to be semantically "rigid": inter-word proximity captures only overall similarity, yet human judgments about object similarities are highly context-dependent and involve multiple, distinct semantic features. For example, dolphins and alligators appear similar in size, but differ in intelligence and aggressiveness. Could such context-dependent relationships be recovered from word embeddings? To address this issue, we introduce a powerful, domain-general solution: "semantic projection" of word-vectors onto lines that represent various object features, like size (the line extending from the word "small" to "big"), intelligence (from "dumb" to "smart"), or danger (from "safe" to "dangerous"). This method, which is intuitively analogous to placing objects "on a mental scale" between two extremes, recovers human judgments across a range of object categories and properties. We thus show that word embeddings inherit a wealth of common knowledge from word co-occurrence statistics and can be flexibly manipulated to express context-dependent meanings.
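
The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `semantic_projection`, the three-dimensional toy vectors, and the choice of a scalar projection onto the "small" to "big" difference vector are all assumptions made here for illustration. The abstract specifies only that word-vectors are projected onto a line running between two antonyms.

```python
import numpy as np

def semantic_projection(word_vec, neg_pole, pos_pole):
    """Scalar projection of word_vec onto the line from neg_pole to pos_pole.

    Larger values mean the word lies further toward pos_pole.
    """
    axis = pos_pole - neg_pole  # direction of the feature line
    return float(np.dot(word_vec, axis) / np.linalg.norm(axis))

# Toy 3-d vectors invented for this example; they are NOT real embeddings.
toy = {
    "small":    np.array([0.0, 1.0, 0.0]),
    "big":      np.array([2.0, 1.0, 0.0]),
    "mouse":    np.array([0.2, 0.5, 0.3]),
    "dolphin":  np.array([1.1, 0.4, 0.9]),
    "elephant": np.array([1.9, 0.6, 0.1]),
}

# Place each animal on the hypothetical "size" scale.
size_scores = {
    word: semantic_projection(toy[word], toy["small"], toy["big"])
    for word in ("mouse", "dolphin", "elephant")
}

# Order the animals from the "small" end of the line to the "big" end.
ranked = sorted(size_scores, key=size_scores.get)
print(ranked)
```

With real embeddings, the same function could be reused with a different antonym pair (for example "dumb" and "smart") to obtain a different ordering of the same words, which is the context-dependence the abstract describes.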

