Multimodal Structure-Aware Quantum Data Processing

  • 2024-11-11 10:03:47
  • Hala Hawashin, Mehrnoosh Sadrzadeh

Abstract

While large language models (LLMs) have advanced the field of natural language processing (NLP), their "black box" nature obscures their decision-making processes. To address this, researchers developed structured approaches using higher-order tensors. These can model linguistic relations, but training them on classical computers stalls because of their excessive size. Tensors are natural inhabitants of quantum systems, and training on quantum computers provides a solution by translating text to variational quantum circuits. In this paper, we develop MultiQ-NLP: a framework for structure-aware data processing with multimodal text+image data. Here, "structure" refers to syntactic and grammatical relationships in language, as well as the hierarchical organization of visual elements in images. We enrich the translation with new types and type homomorphisms and develop novel architectures to represent structure. When tested on a mainstream image classification task (SVO Probes), our best model performed on par with state-of-the-art classical models; moreover, this best model was fully structured.

 
