Learning from What is Already Out There: Few-shot Sign Language Recognition with Online Dictionaries

  • 2023-01-10 03:21:01
  • Matyáš Boháček, Marek Hrúz


Today's sign language recognition models require large training corpora of laboratory-like videos, whose collection involves an extensive workforce and financial resources. As a result, only a handful of such systems are publicly available, not to mention their limited localization capabilities for less-populated sign languages. Utilizing online text-to-video dictionaries, which inherently hold annotated data of various attributes and sign languages, and training models in a few-shot fashion hence poses a promising path for the democratization of this technology. In this work, we collect and open-source the UWB-SL-Wild few-shot dataset, the first-of-its-kind training resource consisting of dictionary-scraped videos. This dataset represents the actual distribution and characteristics of available online sign language data. We select glosses that directly overlap with the already existing datasets WLASL100 and ASLLVD and share their class mappings to allow for transfer learning experiments. Apart from providing baseline results on a pose-based architecture, we introduce a novel approach to training sign language recognition models in a few-shot scenario, resulting in state-of-the-art results on ASLLVD-Skeleton and ASLLVD-Skeleton-20 datasets with top-1 accuracy of $30.97~\%$ and $95.45~\%$, respectively.

