MVP: Multi-source Voice Pathology detection

  • 2025-05-26 15:38:35
  • Alkis Koudounas, Moreno La Quatra, Gabriele Ciravegna, Marco Fantini, Erika Crosetti, Giovanni Succo, Tania Cerquitelli, Sabato Marco Siniscalchi, Elena Baralis

Abstract

Voice disorders significantly impact patient quality of life, yet non-invasive automated diagnosis remains under-explored due to both the scarcity of pathological voice data and the variability in recording sources. This work introduces MVP (Multi-source Voice Pathology detection), a novel approach that leverages transformers operating directly on raw voice signals. We explore three fusion strategies to combine sentence reading and sustained vowel recordings: waveform concatenation, intermediate feature fusion, and decision-level combination. Empirical validation across the German, Portuguese, and Italian languages shows that intermediate feature fusion using transformers best captures the complementary characteristics of both recording types. Our approach achieves up to +13% AUC improvement over single-source methods.
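
The three fusion strategies differ only in where the two recordings are merged, which a toy sketch can make concrete. The code below is an illustration under stated assumptions, not the paper's implementation: `encode` is a hypothetical stand-in for a transformer encoder (it just averages absolute amplitudes per chunk), `classify` is a plain logistic layer, and the use of embedding concatenation for intermediate fusion and probability averaging for decision-level combination are assumptions made for the example.

```python
import math
from statistics import fmean


def encode(waveform, dim=4):
    """Hypothetical stand-in for a transformer encoder.

    Splits the raw signal into `dim` equal chunks and returns the mean
    absolute amplitude of each chunk. Trailing samples that do not fill
    a chunk are ignored.
    """
    size = max(1, len(waveform) // dim)
    chunks = [waveform[i * size:(i + 1) * size] for i in range(dim)]
    return [fmean(abs(x) for x in (chunk or [0.0])) for chunk in chunks]


def classify(features, weights):
    """Logistic layer: probability that the voice is pathological."""
    z = sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


def waveform_concatenation(sentence, vowel, weights):
    # Merge at the input: one long signal, one encoder, one classifier.
    return classify(encode(sentence + vowel), weights)


def intermediate_feature_fusion(sentence, vowel, weights):
    # Merge in feature space: encode each recording separately, then
    # concatenate the two embeddings (assumed fusion operator).
    fused = encode(sentence) + encode(vowel)
    return classify(fused, weights)


def decision_level_combination(sentence, vowel, w_sentence, w_vowel):
    # Merge at the output: one classifier per recording, then average
    # the two probabilities (assumed combination rule).
    p_sentence = classify(encode(sentence), w_sentence)
    p_vowel = classify(encode(vowel), w_vowel)
    return 0.5 * (p_sentence + p_vowel)


if __name__ == "__main__":
    # Made-up signals and weights, for illustration only.
    sentence = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]
    vowel = [0.5] * 8
    print(waveform_concatenation(sentence, vowel, [0.3] * 4))
    print(intermediate_feature_fusion(sentence, vowel, [0.3] * 8))
    print(decision_level_combination(sentence, vowel, [0.3] * 4, [0.3] * 4))
```

Note that intermediate fusion gives the classifier a feature vector twice as long as either single-source model, so it can weight sentence and vowel features independently; the other two strategies do not expose that separation to a single classifier.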

 
