OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages

Abstract

AI technologies for Natural Languages have made tremendous progress recently. However, commensurate progress has not been made on Sign Languages, in particular, in recognizing signs as individual words or as complete sentences. We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition. First, we propose using pose extracted through pretrained models as the standard modality of data to reduce training time and enable efficient inference, and we release standardized pose datasets for 6 different sign languages: American, Argentinian, Chinese, Greek, Indian, and Turkish. Second, we train and release checkpoints of 4 pose-based isolated sign language recognition models across all 6 languages, providing baselines and ready checkpoints for deployment. Third, to address the lack of labelled data, we propose self-supervised pretraining on unlabelled data. We curate and release the largest pose-based pretraining dataset on Indian Sign Language (Indian-SL). Fourth, we compare different pretraining strategies and for the first time establish that pretraining is effective for sign language recognition by demonstrating (a) improved fine-tuning performance, especially in low-resource settings, and (b) high crosslingual transfer from Indian-SL to a few other sign languages. We open-source all models and datasets in OpenHands, available at https://github.com/AI4Bharat/OpenHands, in the hope of making research in sign languages more accessible.
