Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge

Abstract

In this paper, we present our approach to addressing the challenges of the 7th ABAW competition. The competition comprises three sub-challenges: Valence-Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection. To tackle these challenges, we employ state-of-the-art models to extract powerful visual features. Subsequently, a Transformer Encoder is utilized to integrate these features for the VA, Expr, and AU sub-challenges. To mitigate the impact of varying feature dimensions, we introduce an affine module to align the features to a common dimension. Overall, our results significantly outperform the baselines.
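
As a rough illustration of the affine alignment step described above, the following minimal sketch projects feature vectors of different sizes onto one shared dimension so that they could be passed as a token sequence to a Transformer Encoder. The extractor names, the feature dimensions, the shared dimension of 8, and the random placeholder weights are all assumptions made for this example; the abstract does not specify them, and the Transformer Encoder and task heads are deliberately left out.

```python
import random


def make_affine(in_dim, out_dim, rng):
    """Build parameters (W, b) of an affine map y = W x + b.

    The weights are random placeholders; in a real model they would be learned.
    """
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def apply_affine(params, x):
    """Apply y = W x + b to a single feature vector x."""
    weight, bias = params
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def align_features(features, affines):
    """Project each extractor's feature vector to the shared dimension."""
    return [apply_affine(affines[name], vec) for name, vec in features.items()]


rng = random.Random(0)
COMMON_DIM = 8  # hypothetical shared dimension

# Hypothetical extractors with deliberately mismatched output sizes.
feature_dims = {"extractor_a": 16, "extractor_b": 12, "extractor_c": 5}

# One affine map per extractor, each mapping its own size to COMMON_DIM.
affines = {name: make_affine(dim, COMMON_DIM, rng)
           for name, dim in feature_dims.items()}

# Stand-in features for a single frame.
frame_features = {name: [rng.random() for _ in range(dim)]
                  for name, dim in feature_dims.items()}

# Three tokens of equal length, ready to be stacked for a sequence model.
tokens = align_features(frame_features, affines)
```

Treating each aligned vector as one token is only one possible fusion scheme; concatenating or summing the aligned vectors would work with the same projection step.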
