Open Information Extraction with Global Structure Constraints

  • 2018-05-15 16:01:40
  • Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu, Jiawei Han
Extracting entities and their relations from text is an important task for understanding massive text corpora. Open information extraction (IE) systems mine relation tuples (i.e., entity arguments and a predicate string to describe their relation) from sentences. However, current open IE systems ignore the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions. In this paper, we propose a novel open IE system, called ReMine, which integrates local context signal and global structural signal in a unified framework with distant supervision. The new system can be efficiently applied to different domains as it uses facts from external knowledge bases as supervision; and can effectively score sentence-level tuple extractions based on corpus-level statistics. Specifically, we design a joint optimization problem to unify (1) segmenting entity/relation phrases in individual sentences based on local context; and (2) measuring the quality of sentence-level extractions with a translating-based objective. Experiments on real-world corpora from different domains demonstrate the effectiveness and robustness of ReMine when compared to other open IE systems.
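
To make the idea of a translating-based objective concrete, here is a minimal sketch of one common form of such a score, in which a tuple (head, predicate, tail) is rated by how closely head + predicate lands on tail in an embedding space. This illustrates the general technique only and is not ReMine's actual objective or code; the function names, the two-dimensional vectors, and the example tuples are all hypothetical.

```python
import math

def translation_score(head, predicate, tail):
    """Negative L2 distance ||head + predicate - tail||; closer to 0 is better."""
    return -math.sqrt(sum((h + p - t) ** 2 for h, p, t in zip(head, predicate, tail)))

def rank_tuples(candidates, emb):
    """Sort candidate (head, predicate, tail) tuples from best to worst score."""
    return sorted(
        candidates,
        key=lambda c: translation_score(emb[c[0]], emb[c[1]], emb[c[2]]),
        reverse=True,
    )

# Hypothetical toy embeddings, chosen by hand for illustration only.
embeddings = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Tokyo": [0.0, 3.0],
    "capital of": [0.0, 1.0],
}

candidates = [
    ("Paris", "capital of", "Tokyo"),
    ("Paris", "capital of", "France"),
]
ranked = rank_tuples(candidates, embeddings)
```

In this toy setting, the tuple whose tail vector coincides with head + predicate ranks first. In a real system the embeddings would be learned from corpus-level statistics rather than fixed by hand.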


