A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains

  • 2025-07-17 17:45:09
  • Antonio Finocchiaro, Alessandro Sebastiano Catinello, Michele Mazzamuto, Rosario Leonardi, Antonino Furnari, Giovanni Maria Farinella

Abstract

Hand-object interaction detection remains an open challenge in real-time applications, where intuitive user experiences depend on fast and accurate detection of interactions with surrounding objects. We propose an efficient approach for detecting hand-object interactions from streaming egocentric vision that operates in real time. Our approach consists of an action recognition module and an object detection module for identifying active objects upon confirmed interaction. Our Mamba model with an EfficientNetV2 backbone for action recognition achieves 38.52% p-AP on the ENIGMA-51 benchmark at 30 fps, while our fine-tuned YOLOWorld reaches 85.13% AP for hands and objects. We implement our models in a cascaded architecture where the action recognition and object detection modules operate sequentially. When the action recognition module predicts a contact state, it activates the object detection module, which in turn performs inference on the relevant frame to detect and classify the active object.
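
The cascaded control flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cascade`, `predict_contact`, and `detect_objects` are hypothetical names, the two models are stood in for by plain callables, and the contact decision is made per frame, whereas the actual action recognition module works on a video stream.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Detection = Tuple[str, float]  # hypothetical (label, score) pair


def cascade(
    frames: Iterable[Frame],
    predict_contact: Callable[[Frame], bool],
    detect_objects: Callable[[Frame], List[Detection]],
) -> Iterator[Tuple[int, List[Detection]]]:
    """Yield (frame_index, detections) only for frames flagged as contact."""
    for index, frame in enumerate(frames):
        # Stage 1 (assumed interface): the action recognizer sees every frame.
        if not predict_contact(frame):
            continue
        # Stage 2: the object detector runs only after a contact prediction.
        yield index, detect_objects(frame)


if __name__ == "__main__":
    # Stand-in stream and models, purely for illustration.
    stream = ["idle", "contact", "idle", "contact"]
    results = list(
        cascade(
            stream,
            predict_contact=lambda frame: frame == "contact",
            detect_objects=lambda frame: [("object", 0.9)],
        )
    )
    print(results)  # [(1, [('object', 0.9)]), (3, [('object', 0.9)])]
```

Because the second stage is skipped whenever no contact is predicted, the detector's cost is paid only on the frames where an interaction is reported, which is the point of running the two modules sequentially.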

 
