Abstract
The development of sign language processing technology relies on extensive and reliable datasets, instructions, and ethical guidelines. We present a comprehensive Azerbaijani Sign Language Dataset (AzSLD), collected from diverse sign language users and covering a range of linguistic parameters, to facilitate advancements in sign recognition and translation systems and to support the local sign language community. The dataset was created within the framework of a vision-based AzSL translation project. This study introduces the dataset as a summary of the fingerspelling alphabet and the sentence- and word-level sign language datasets. The data were collected from signers of different ages, genders, and signing styles, with videos recorded from two camera angles to capture each sign in full detail. This approach supports robust training and evaluation of gesture recognition models. AzSLD contains 30,000 videos, each carefully annotated with accurate sign labels and corresponding linguistic translations. The dataset is accompanied by technical documentation and source code to facilitate its use in training and testing. It offers a valuable resource of labeled data for researchers and developers working on sign language recognition, translation, or synthesis. Ethical guidelines were strictly followed throughout the project, with all participants providing informed consent for the collection, publication, and use of the data.