Abstract
Speech impediments affect both verbal and nonverbal communication, leading many individuals to rely on sign language and other alternative methods. Non-signers, however, often cannot follow these conversations because they lack sign language knowledge. Recent advances in deep learning and computer vision have improved gesture recognition, enabling new solutions for sign language translation. This project proposes a computer vision-based deep learning application that translates sign language gestures into text, improving communication between signers and non-signers. The system extracts spatial and temporal information from video sequences, using a Convolutional Neural Network (CNN) to process depth and point data and a Gated Recurrent Unit (GRU) to capture temporal features. Temporal tokenization further refines the feature representation and keeps resource utilization efficient. The model is trained on the Word-Level American Sign Language (WLASL) dataset, the largest publicly available ASL dataset, containing over 2,000 words signed by more than 100 individuals, and it recognizes 20 gestures with 94% accuracy. The final implementation is a web application that delivers real-time text translation, fostering seamless communication between signers and non-signers and addressing accessibility challenges for individuals with speech impairments.
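To make the described architecture concrete, below is a minimal PyTorch sketch of a CNN frame encoder followed by a GRU over the frame sequence, assuming RGB video clips and a 20-class output. The layer sizes, input resolution, and the omission of the depth/point preprocessing and temporal tokenization steps are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch of a CNN + GRU gesture classifier (assumed configuration).
import torch
import torch.nn as nn

class SignGestureRecognizer(nn.Module):
    """CNN extracts per-frame spatial features; a GRU aggregates them over time."""

    def __init__(self, num_classes: int = 20, hidden_size: int = 256):
        super().__init__()
        # Per-frame spatial feature extractor (illustrative layer sizes).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch*time, 64, 1, 1)
        )
        # GRU models the temporal dynamics of the frame features.
        self.gru = nn.GRU(input_size=64, hidden_size=hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, time, channels, height, width)
        b, t, c, h, w = video.shape
        frames = video.reshape(b * t, c, h, w)
        feats = self.cnn(frames).flatten(1)      # (batch*time, 64)
        feats = feats.reshape(b, t, -1)          # (batch, time, 64)
        _, last_hidden = self.gru(feats)         # (1, batch, hidden_size)
        return self.classifier(last_hidden[-1])  # (batch, num_classes)

if __name__ == "__main__":
    model = SignGestureRecognizer()
    clip = torch.randn(2, 16, 3, 112, 112)  # 2 clips of 16 frames each (assumed size)
    print(model(clip).shape)                # torch.Size([2, 20])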
Keywords: Datasets; Accuracy; Data processing; Deep learning; Sign language; Applications programs; Communication; Computer vision; Artificial neural networks; Neural networks; Gesture recognition; Hearing loss; Natural language processing; Deafness; Machine learning; Resource utilization; Real time; Language translation; Speech; Speech recognition
1 Department of CSE-AI, Chalapathi Institute of Engineering and Technology, LAM, Guntur, Andhra Pradesh, India.