Abstract

This research proposes an advanced approach that integrates multiple image detection techniques and natural language processing (NLP) methodologies for English-to-Spanish language translation. The developed software accepts an image as input, which undergoes preprocessing using adaptive thresholding, morphological transformations, and edge detection algorithms such as Canny and Sobel operators to enhance text clarity. Text detection and localization are achieved using the EfficientDet and EAST (Efficient and Accurate Scene Text) detector frameworks, followed by Optical Character Recognition (OCR) using PyTesseract, a wrapper for Google’s Tesseract OCR. The detected text is passed to an NLP system for translation, which employs a sequence-to-sequence transformer model implemented with Keras, TensorFlow, and NumPy. Additional techniques, such as Byte Pair Encoding (BPE) for text tokenization and positional encoding for transformer-based attention, improve translation efficiency. An English-Spanish dictionary from Anki and a large parallel corpus dataset were used for training. The NLP pipeline leverages semantic analysis, part-of-speech tagging, and dependency parsing to preserve grammatical structure and context. Fine-tuning the transformer model parameters, including learning rate scheduling and gradient clipping, further optimized system performance. The research demonstrates a 93.7% translation accuracy, achieved by combining state-of-the-art image processing algorithms, advanced transformer architectures, and a robust training dataset. This hybrid approach significantly improves the accuracy of English-to-Spanish translations, validating the effectiveness of integrating computer vision and NLP technologies.
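
The abstract names Byte Pair Encoding (BPE) as the tokenization step but does not describe how it was implemented. As a rough illustration only, the sketch below shows the standard BPE merge-learning loop on a toy word-frequency table. The function names (`count_pairs`, `merge_pair`, `learn_bpe`), the `</w>` end-of-word marker, the toy corpus, and the tie-breaking rule are assumptions made for this example and are not taken from the article.

```python
from collections import Counter


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a {word: frequency} table."""
    # Start from single characters plus an end-of-word marker (assumed convention).
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself so the
        # result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


# Toy corpus (hypothetical): word -> frequency.
toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(toy_corpus, num_merges=3)
for symbols, freq in vocab.items():
    print(" ".join(symbols), freq)
```

In this toy run the frequent suffix shared by "newest" and "widest" is merged into a single subword, which is the behaviour that lets BPE keep the vocabulary small while still covering rare words.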

Article highlights

Enhanced translation accuracy—AI-powered translation models improve English-to-Spanish accuracy by preserving grammar, sentence structure, and contextual meaning. The integration of deep learning and NLP ensures precise translations across various text types.

Optimized image-to-text extraction—advanced OCR techniques, including adaptive thresholding, morphological transformations, and deep learning-based text detection, enhance the accuracy of extracting text from images, even in complex backgrounds or poor lighting conditions.

Scalable AI solutions for real-world applications—the combination of computer vision and NLP enables practical applications in multilingual communication, document translation, accessibility for visually impaired users, and real-time text recognition for travel, business, and education. The AI-driven approach ensures scalability across diverse environments and languages.
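
The highlights above mention adaptive thresholding as part of image-to-text extraction. The following is a minimal, dependency-free sketch of mean-based adaptive thresholding on a grayscale image stored as a list of rows. It is not the authors' implementation: the function name `adaptive_threshold_mean`, the parameters `block_size`, `c`, and `max_value`, and the choice to clip the window at the image border are all assumptions made for illustration.

```python
def adaptive_threshold_mean(image, block_size=3, c=2, max_value=255):
    """Binarize a grayscale image using a per-pixel local-mean threshold.

    A pixel becomes `max_value` if it is brighter than the mean of its
    block_size x block_size neighbourhood minus `c`, and 0 otherwise.
    The window is clipped at the image border (an assumption of this sketch).
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd integer >= 3")
    height, width = len(image), len(image[0])
    radius = block_size // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [image[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            threshold = sum(window) / len(window) - c
            result[y][x] = max_value if image[y][x] > threshold else 0
    return result


# Hypothetical 3x3 patch: one dark "ink" pixel on a bright background.
patch = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_threshold_mean(patch)
for row in binary:
    print(row)
```

Because the threshold is computed per neighbourhood rather than once for the whole image, dark text can still be separated from the background when lighting varies across the image, which is the situation the highlight describes.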

Details

Title
Breaking language barriers with image detection and natural language processing model for English to Spanish translation
Author
Salman, Bakhita 1; Lopez, Andres 1; Delapena, Nathanielle 1

1 Texas A&M International University, School of Engineering, Laredo, USA (GRID:grid.264755.7) (ISNI:0000 0000 8747 9982)
Publication title
Volume
7
Issue
7
Pages
682
Publication year
2025
Publication date
Jul 2025
Publisher
Springer Nature B.V.
Place of publication
London
Country of publication
Netherlands
Publication subject
ISSN
2523-3963
e-ISSN
2523-3971
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-07-01
Milestone dates
2025-04-07 (Registration); 2025-01-08 (Received); 2025-04-07 (Accepted)
First posting date
01 Jul 2025
ProQuest document ID
3226286529
Document URL
https://www.proquest.com/scholarly-journals/breaking-language-barriers-with-image-detection/docview/3226286529/se-2?accountid=208611
Copyright
Copyright Springer Nature B.V. Jul 2025
Last updated
2025-07-02
Database
ProQuest One Academic