NSCaching: Simple and Efficient Negative Sampling for Knowledge Graph Embedding

  • 2019-01-18 11:34:27
  • Yongqi Zhang, Quanming Yao, Yingxia Shao, Lei Chen

Abstract

Knowledge Graph (KG) embedding is a fundamental problem in data mining research with many real-world applications. It aims to encode the entities and relations in the graph into a low-dimensional vector space, which can be used by subsequent algorithms. Negative sampling, which samples negative triplets from the non-observed ones in the training data, is an important step in KG embedding. Recently, generative adversarial networks (GANs) have been introduced into negative sampling. By sampling negative triplets with large scores, these methods avoid the problem of vanishing gradients and thus obtain better performance. However, using a GAN makes the original model more complex and harder to train, since reinforcement learning must be used. In this paper, motivated by the observation that negative triplets with large scores are important but rare, we propose to directly keep track of them with a cache. How to sample from the cache and how to update it are two important questions. We carefully design solutions that are not only efficient but also achieve a good balance between exploration and exploitation. In this way, our method acts as a "distilled" version of previous GAN-based methods, which does not waste training time on additional parameters to fit the full distribution of negative triplets. Extensive experiments show that our method yields significant improvements in various KG embedding models and outperforms the state-of-the-art negative sampling methods based on GANs.
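
The cache idea described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the class name NegativeCache, the score_fn callback, and the parameter names are hypothetical, and the update rule shown here (keep the top-scoring candidates) is a deliberate simplification chosen for brevity. The abstract does not specify the sampling or update strategies. In this sketch, exploration comes only from the fresh random candidates drawn at each update, and observed (true) triplets are not filtered out.

```python
import random


class NegativeCache:
    """Toy cache of high-scoring negative tails, keyed by (head, relation).

    Hypothetical illustration only; names and update rule are assumptions.
    """

    def __init__(self, num_entities, cache_size, num_candidates, score_fn, seed=0):
        self.num_entities = num_entities
        self.cache_size = cache_size          # entries kept per (head, relation)
        self.num_candidates = num_candidates  # fresh random entities per update
        self.score_fn = score_fn              # score_fn(head, relation, tail) -> float
        self.rng = random.Random(seed)
        self.cache = {}

    def _entry(self, key):
        # Lazily fill a new cache entry with distinct random entities.
        if key not in self.cache:
            self.cache[key] = self.rng.sample(range(self.num_entities), self.cache_size)
        return self.cache[key]

    def sample(self, head, relation):
        # Draw one negative tail uniformly from the cached candidates.
        return self.rng.choice(self._entry((head, relation)))

    def update(self, head, relation):
        # Mix cached tails with fresh random ones, then keep the best-scoring.
        entry = self._entry((head, relation))
        fresh = [self.rng.randrange(self.num_entities) for _ in range(self.num_candidates)]
        pool = list(set(entry) | set(fresh))
        pool.sort(key=lambda tail: self.score_fn(head, relation, tail), reverse=True)
        self.cache[(head, relation)] = pool[: self.cache_size]
```

In a training loop, one would call sample(h, r) to obtain a negative tail for each positive triplet and call update(h, r) periodically with the current embedding model's scoring function, so that the cache tracks negatives the model currently scores highly.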

 
