Copyright Universidad de Tarapacá 2011

Abstract

We present an automatic audiovisual speech recognition system for command recognition. The audio signal was represented using Mel cepstral coefficients and their first- and second-order time derivatives. To characterize the video signal, a set of high-level visual features was tracked throughout the sequences. The algorithm was initialized automatically using color transformations and active contour models based on Gradient Vector Flow (GVF snakes) on the lip region, whereas visual tracking used similarity measures across neighborhoods and morphological restrictions defined in the MPEG-4 standard. First, we show the design process for an isolated-word audio speech recognition (ASR) system using Hidden Markov Models. Next, we show the design process for speech recognition systems using only video features (VSR) and using audio and video features combined (AVSR). Finally, we compare the results of the three systems on our database in Spanish and French, showing that AVSR outperforms ASR and VSR under increased acoustic noise in the sequences.

Details

Title
Sistema audiovisual para reconocimiento de comandos/Audiovisual system for recognition of commands
Author
Ceballos, Alexander; Serna-Morales, Andrés F; Prieto, Flavio; Gómez, Juan B; Redarce, Tanneguy
Pages
278-291
Publication year
2011
Publication date
2011
Publisher
Universidad de Tarapacá
ISSN
0718-3291
e-ISSN
0718-3305
Source type
Scholarly Journal
Language of publication
Spanish
ProQuest document ID
906290348