Knowledge Aided Consistency for Weakly Supervised Phrase Grounding

  • 2018-03-11 02:00:24
  • Kan Chen, Jiyang Gao, Ram Nevatia

Abstract

Given a natural language query, a phrase grounding system aims to localize the mentioned objects in an image. In the weakly supervised scenario, the mapping between image regions (i.e., proposals) and language is not available in the training set. Previous methods address this deficiency by training a grounding system to reconstruct the language information contained in input queries from predicted proposals. However, the optimization is guided solely by the reconstruction loss from the language modality, and it ignores the rich visual information contained in proposals as well as useful cues from external knowledge. In this paper, we explore the consistency contained in both the visual and language modalities, and leverage complementary external knowledge to facilitate weakly supervised grounding. We propose a novel Knowledge Aided Consistency Network (KAC Net), which is optimized by reconstructing both the input query and the proposal's information. To leverage complementary knowledge contained in the visual features, we introduce a Knowledge Based Pooling (KBP) gate to focus on query-related proposals. Experiments show that KAC Net provides a significant improvement on two popular datasets.

 
