Automatic Speech Segmentation Based On Audio and Optical Flow Visual Classification

Abstract

Automatic speech segmentation, an important part of automatic speech recognition (ASR) systems, is highly sensitive to noise. Noise arises from changes in the communication channel, the background, the speaker's volume, and so on. In recent years, many researchers have proposed noise cancellation techniques and have added visual features from the speaker's face to reduce the effect of noise on ASR systems. Removing noise from audio signals depends on the type of noise, so it cannot serve as a general solution. Adding visual features compensates for this limitation, but advanced methods of this type require manual extraction of the visual features. In this paper we propose a completely automatic system that uses optical flow vectors computed from the speaker's image sequence to obtain visual features. Hidden Markov Models are then trained to segment the audio signal from the image sequences and audio features, based on the extracted optical flow. The resulting segmentation system is fully automatic and more robust to noise.
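The idea of combining audio and optical flow features in a Hidden Markov Model can be illustrated with a minimal sketch. This is not the authors' implementation: the two-state layout (silence/speech), the Gaussian emission model, every numeric parameter, and the function names flow_magnitude, log_emission and viterbi are assumptions made for illustration only. The abstract describes trained models, whereas the parameters here are set by hand, and the optical flow vectors are assumed to have been computed already.

```python
import math

# Hypothetical two-state model: 0 = silence, 1 = speech.
# All numbers below are illustrative assumptions, not values from the paper.
STATES = (0, 1)
LOG_START = [math.log(0.5), math.log(0.5)]
LOG_TRANS = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
# Per state: (mean, std) for audio energy, then (mean, std) for flow magnitude.
EMISSION = [((0.1, 0.2), (0.1, 0.2)),   # silence
            ((0.8, 0.2), (0.8, 0.2))]   # speech


def log_gauss(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return (-0.5 * math.log(2 * math.pi * std * std)
            - (x - mean) ** 2 / (2 * std * std))


def flow_magnitude(flow_vectors):
    """Mean magnitude of the (dx, dy) optical flow vectors of one frame."""
    if not flow_vectors:
        return 0.0
    return sum(math.hypot(dx, dy) for dx, dy in flow_vectors) / len(flow_vectors)


def log_emission(state, audio_energy, flow_mag):
    """Treat the audio and visual features as independent given the state."""
    (a_mean, a_std), (f_mean, f_std) = EMISSION[state]
    return (log_gauss(audio_energy, a_mean, a_std)
            + log_gauss(flow_mag, f_mean, f_std))


def viterbi(observations):
    """Most likely state sequence for a list of (audio_energy, flow_mag)."""
    scores = [LOG_START[s] + log_emission(s, *observations[0]) for s in STATES]
    back = []
    for obs in observations[1:]:
        new_scores, pointers = [], []
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + LOG_TRANS[p][s])
            new_scores.append(scores[best_prev] + LOG_TRANS[best_prev][s]
                              + log_emission(s, *obs))
            pointers.append(best_prev)
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Frame 4 has ambiguous audio energy (0.5) but clear lip motion (0.9).
    frames = [(0.1, 0.0), (0.1, 0.05), (0.9, 0.8),
              (0.5, 0.9), (0.8, 0.7), (0.1, 0.1)]
    print(viterbi(frames))
```

In this sketch a frame whose audio energy is ambiguous can still be labelled from its visual feature, which is the kind of robustness to audio noise the abstract attributes to adding visual information.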

Details

Title

Automatic Speech Segmentation Based On Audio and Optical Flow Visual Classification

Author

Torabi, Behnam; Naghsh Nilchi, Ahmad Reza

Pages

43-49

Publication year

2014

Publication date

Oct 2014

Publisher

Modern Education and Computer Science Press

ISSN

20749074

e-ISSN

20749082

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5815/ijigsp.2014.11.06

ProQuest document ID

1626456616
