Even though technological advancements have given most people better access to education, roughly 70 million people with hearing impairments still cannot use the majority of these tools. Our project aims to advance SDG 4 by minimizing this gap: ensuring inclusive and equitable quality education and promoting lifelong learning opportunities for all. Through our platform Bornil, we will enable the creation of video-based sign language datasets for all 300+ low-resource sign language dialects. These datasets will be used for Automated Sign Language Recognition (ASLR) for these dialects so that people who use them can communicate efficiently. Currently, there are no standardized protocols for crowdsourcing sign language dataset construction, and data acquisition is highly expensive. To democratize the data collection process and expedite the development of ASLR, we are offering this open-source platform, which addresses multiple factors of the data collection process. We will account for explicit age- and gender-based differentials, and our platform will enable crowdsourcing of sign language recordings using readily available devices such as phones and laptops, without the need to set up recording studios. Using an intuitive platform interface, we will enable gathering data from a variety of sources, including the deaf community, hard-of-hearing persons, children of deaf parents, and siblings of deaf adults. To make the data representative of the real world, volunteer contributors will vary camera resolution, background noise, and other recording conditions, reporting them as metadata. GDPR compliance will be ensured throughout the process. Users will contribute to the datasets in three ways: by recording videos, validating recordings and metadata, and annotating videos. We will curate this data so that it can be used to train ASLR systems with artificial intelligence algorithms. As a case study, we will use the platform to investigate the statistical variety of the participants and regional variations in Bangladeshi Sign Language.
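As an illustration, the per-recording metadata and the three contribution modes described above could be represented as in the following sketch. Every name, field, and value set here is an illustrative assumption, not the platform's actual schema.

```typescript
// Minimal sketch of data structures for the contribution workflow described
// above; all names and fields are illustrative assumptions.

// The three ways users contribute to the datasets.
type ContributionKind = "recording" | "validation" | "annotation";

// Per-recording metadata submitted by volunteer contributors, covering the
// explicit age/gender differentials and real-world recording conditions.
interface RecordingMetadata {
  dialect: string;                       // e.g. "Bangladeshi Sign Language"
  signerGroup:                           // contributor's relation to the deaf community
    | "deaf"
    | "hard-of-hearing"
    | "child-of-deaf-parents"
    | "sibling-of-deaf-adult";
  ageRange: string;                      // e.g. "18-25"
  gender: string;
  device: "phone" | "laptop" | "other";  // readily available devices
  cameraResolution: string;              // e.g. "1280x720", contributor-reported
  backgroundNoise: "low" | "medium" | "high"; // contributor-reported condition
  consentGiven: boolean;                 // GDPR: explicit consent stored with the data
}

// A single contribution event, whichever of the three kinds it is.
interface Contribution {
  kind: ContributionKind;
  userId: string;
  videoId: string;
  submittedAt: Date;
}
```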
The recording section provides users with a built-in video recorder. Users are given a text (one or more sentences, or a topic) in their selected language, for which they record a video in sign language. Users also submit metadata related to the recording. After recording, the platform automatically switches to a video player where the user can preview the recording.
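For concreteness, here is a minimal sketch of such a record-then-preview flow using the standard browser MediaDevices/MediaRecorder APIs. The function names, the WebM container choice, and the single video element design are assumptions for illustration, not Bornil's actual implementation.

```typescript
// Minimal sketch: record from the webcam, then switch the same <video>
// element from live capture to playback so the user can preview the take.

async function startRecording(preview: HTMLVideoElement) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  preview.srcObject = stream;            // live view while the user signs
  await preview.play();

  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.start();

  // Returned so a "Stop" button can end the take and trigger the preview.
  return function stopAndPreview(): Promise<Blob> {
    return new Promise((resolve) => {
      recorder.onstop = () => {
        stream.getTracks().forEach((t) => t.stop()); // release the camera
        const clip = new Blob(chunks, { type: "video/webm" });
        preview.srcObject = null;        // switch from capture to playback
        preview.src = URL.createObjectURL(clip);
        preview.controls = true;         // user can now review the recording
        resolve(clip);                   // blob is ready for upload with metadata
      };
      recorder.stop();
    });
  };
}
```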
In this section, users are given an unannotated video and its corresponding text/topic, along with an interface of text boxes and timestamps that they use for annotation. The platform provides a timeline with drag-and-drop support, allowing users to freely change the length, start time, and end time of the text boxes and to seek the video to a specific time.
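A minimal sketch of how such timeline segments could be represented and kept in sync with the player is shown below; the Segment shape and the helper names are assumptions, not the platform's actual code.

```typescript
// A timestamped text box on the annotation timeline (illustrative shape).
interface Segment {
  text: string;        // the annotated sentence or phrase
  startSec: number;    // set by dragging the box's left edge
  endSec: number;      // set by dragging the box's right edge
}

// Clicking a segment seeks the video to the segment's start time.
function seekToSegment(video: HTMLVideoElement, seg: Segment): void {
  video.currentTime = seg.startSec;
}

// Dragging or resizing updates the segment, clamped to the clip's duration
// and kept non-degenerate (end strictly after start).
function resizeSegment(
  seg: Segment,
  startSec: number,
  endSec: number,
  duration: number
): Segment {
  const s = Math.max(0, Math.min(startSec, duration));
  const e = Math.max(s + 0.1, Math.min(endSec, duration));
  return { ...seg, startSec: s, endSec: e };
}
```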
The validation section is used to validate recorded videos, their metadata, and their annotations. Metadata validation resembles the recording flow: a user watches the video and checks whether the recording and the provided information are correct. For annotation validation, users check whether the annotation matches the video. In both cases, users are able to fix any errors.
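As a sketch, the validation outcomes described above could be recorded and aggregated as follows; the record shape and the agreement threshold are illustrative assumptions.

```typescript
// The two things a validator checks for a given video.
type ValidationTarget = "metadata" | "annotation";

// One validator's verdict on one video (illustrative shape).
interface ValidationResult {
  videoId: string;
  target: ValidationTarget;
  isCorrect: boolean;          // does the recording/annotation match?
  correction?: string;         // optional fix supplied by the validator
  validatorId: string;
}

// A video's metadata or annotation could be accepted once enough independent
// validators agree; the default threshold of 3 is an assumed example.
function isAccepted(results: ValidationResult[], threshold = 3): boolean {
  return results.filter((r) => r.isCorrect).length >= threshold;
}
```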
Mohammad Akhlaqur Rahman
Department of Software Engineering, IICT, SUST
Shahriar Elahi Dhruvo
Department of Software Engineering, IICT, SUST
Farig Sadeque
Co-Principal Investigator, Bengali.AI; Assistant Professor, BRAC University
Sabbir Ahmed Chowdhury
Assistant Professor, IER, Dhaka University & PhD Researcher, UWS, UK
Asif Sushmit
Co-Principal Investigator & Dataset Coordinator, Bengali.AI; PhD Student, RPI
Rezwana Sultana
Coordinator, Bengali.AI (Attached Officer, A2I, ICT Division, Bangladesh)