SOWA: Adapting Hierarchical Frozen Window Self-Attention to Visual-Language Models for Better Anomaly Detection

  • 2024-07-04 05:54:03
  • Zongxiang Hu, Zhaosheng Zhang

Abstract

Visual anomaly detection is critical in industrial manufacturing, but traditional methods often rely on extensive normal datasets and custom models, limiting scalability. Recent advancements in large-scale visual-language models have significantly improved zero/few-shot anomaly detection. However, these approaches may not fully utilize hierarchical features, potentially missing nuanced details. We introduce a window self-attention mechanism based on the CLIP model, combined with learnable prompts to process multi-level features within a Soldier-Officer Window self-Attention (SOWA) framework. Our method has been tested on five benchmark datasets, demonstrating superior performance by leading in 18 out of 20 metrics compared to existing state-of-the-art techniques.

 
