Retrieval over Classification: Integrating Relation Semantics for Multimodal Relation Extraction

  • 2025-09-25 13:38:38
  • Lei Hei, Tingjing Liao, Yingxin Pei, Yiyang Qi, Jiaqi Wang, Ruiting Li, Feiliang Ren

Abstract

Relation extraction (RE) aims to identify semantic relations between entities in unstructured text. Although recent work extends traditional RE to multimodal scenarios, most approaches still adopt classification-based paradigms with fused multimodal features, representing relations as discrete labels. This paradigm has two significant limitations: (1) it overlooks structural constraints like entity types and positional cues, and (2) it lacks semantic expressiveness for fine-grained relation understanding. We propose Retrieval Over Classification (ROC), a novel framework that reformulates multimodal RE as a retrieval task driven by relation semantics. ROC integrates entity type and positional information through a multimodal encoder, expands relation labels into natural language descriptions using a large language model, and aligns entity-relation pairs via semantic similarity-based contrastive learning. Experiments show that our method achieves state-of-the-art performance on the benchmark datasets MNRE and MORE and exhibits stronger robustness and interpretability.
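
As an illustration only, the retrieval formulation described in the abstract can be sketched as follows: an entity-pair embedding is scored against embeddings of relation descriptions by cosine similarity, and an InfoNCE-style contrastive loss treats the gold description as the positive and the others as negatives. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy 3-dimensional vectors, the relation labels, and the temperature value are all hypothetical, and the actual encoder, description generation, and loss used by ROC may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(pair_vec, relation_vecs):
    """Return the best-matching relation label and all similarity scores."""
    scores = {label: cosine(pair_vec, vec) for label, vec in relation_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores

def info_nce(pair_vec, relation_vecs, gold, temperature=0.1):
    """InfoNCE-style loss: the gold description is the positive,
    every other description acts as a negative."""
    logits = {label: cosine(pair_vec, vec) / temperature
              for label, vec in relation_vecs.items()}
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits.values())
    log_z = m + math.log(sum(math.exp(x - m) for x in logits.values()))
    return log_z - logits[gold]

# Hypothetical embeddings of relation descriptions (made-up values).
relation_vecs = {
    "member_of": [0.9, 0.1, 0.0],
    "place_of_residence": [0.1, 0.9, 0.1],
    "none": [0.0, 0.1, 0.9],
}
# Hypothetical embedding of an entity pair from the multimodal encoder.
pair_vec = [0.8, 0.2, 0.1]

best, scores = retrieve(pair_vec, relation_vecs)
loss = info_nce(pair_vec, relation_vecs, gold="member_of")
print(best, round(loss, 4))
```

Because the prediction is the nearest description rather than an index into a fixed label set, the similarity scores themselves indicate which relation semantics the pair matched, which is one way to read the interpretability claim.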

 
