Discriminative Training of Acoustic Models for Mispronunciation Detection and Diagnosis of Non-native English

Abstract

This thesis applies discriminative training techniques to improve the acoustic modeling in mispronunciation detection and diagnosis (MD&D) for computer-aided pronunciation training. Discriminative training of generative models improves classification performance by bringing in competing classes and optimizes a task-relevant evaluation criterion to tune the decision boundaries, as is done in discriminative models by nature. This work formulates and optimizes discriminative training criteria for generative GMM-HMMs in two broad frameworks of MD&D. The first framework explicitly models the phonetic error patterns from a labelled non-native speech corpus and populates the recognition network with the extracted and predicted error patterns. Discriminative training of GMM-HMMs by minimizing the expected full-sequence word-level errors brings down the word-level error by 16% relative. Nevertheless, explicit error pattern modeling suffers from missing error patterns and from the inclusion of rare and idiosyncratic ones. In addition, a balance has to be struck between under-generation and over-generation of error patterns. The second, recently proposed framework seeks to abandon explicit error pattern modeling by instantiating a set of anti-phones and a filler model with GMM-HMMs, and crafts general phone error detection and diagnosis networks that encompass all possible errors. This design renders explicit error pattern modeling unnecessary. In the two-pass framework, discriminative training of GMM-HMMs by minimizing the full-sequence phone-level errors lowers the phone-level error by 40% relative. Visualization of the GMM parameters shows that discriminative training effectively separates the canonical phones and their anti-phones.

Details

Subject

Electrical engineering

Classification

0544: Electrical engineering

Identifier / keyword

Applied sciences; Discriminative training; Mispronunciation detection and diagnosis; Pronunciation training; Speech recognition

Title

Discriminative Training of Acoustic Models for Mispronunciation Detection and Diagnosis of Non-native English

Author

Qian, Xiaojun

Number of pages

111

Degree date

2015

School code

1307

Source

DAI-B 78/05(E), Dissertation Abstracts International

ISBN

978-1-369-41027-3

Advisor

Meng, Mei Ling Helen

University/institution

The Chinese University of Hong Kong (Hong Kong)

University location

Hong Kong

Degree

Ph.D.

Source type

Dissertation or Thesis

Language

English

Document type

Dissertation/Thesis

Dissertation/thesis number

10297322

ProQuest document ID

1846478052

Document URL

https://www.proquest.com/dissertations-theses/discriminative-training-acoustic-models/docview/1846478052/se-2?accountid=208611

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.

Database

ProQuest One Academic
