Abstract

We propose two key improvements to the time-frequency masking approach to blind speech separation of underdetermined mixtures, covering both anechoic and echoic mixtures. First, the proposed method copes with the typically large delay estimation error that appears in the low-frequency band. This step generates a restrictive mask for phase delays based on an analysis of the local and global energy distribution, so that only the selected time-frequency cells contribute to the orientation histogram. Second, the strong W-disjoint orthogonality (WDO) assumption, under which the sources do not overlap in the time-frequency domain, is relaxed by allowing some frequency bins to be shared by both sources. The speakers' fundamental frequencies are detected at each time instant, and their harmonic frequencies are then used to support mask creation. The proposed method proved effective and reliable in experiments with both simulated
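
The first step described in the abstract (letting only selected cells vote in a delay histogram) can be illustrated with a minimal sketch. It is not the authors' algorithm: it handles a single frame, uses a plain global energy threshold in place of the local and global energy analysis, and the names `dft`, `delay_histogram`, `energy_ratio` and `bin_width` are hypothetical. Each delay estimate is a phase difference divided by the bin frequency, so a small phase error is amplified at low frequencies, which is one plausible reading of why the low band is treated separately.

```python
import cmath
import math
from collections import Counter


def dft(frame):
    """Naive DFT of a real frame, returning the positive-frequency bins."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def delay_histogram(x1, x2, energy_ratio=0.1, bin_width=0.5):
    """Histogram of per-bin inter-channel delay estimates (in samples).

    energy_ratio is an assumed threshold: a frequency bin votes only if
    its energy is at least this fraction of the frame's peak bin energy.
    """
    n = len(x1)
    s1, s2 = dft(x1), dft(x2)
    energy = [abs(a) ** 2 + abs(b) ** 2 for a, b in zip(s1, s2)]
    peak = max(energy)
    votes = Counter()
    for k in range(1, n // 2):  # skip the DC and Nyquist bins
        if energy[k] < energy_ratio * peak:
            continue  # masked out: too weak to give a reliable phase
        omega = 2 * math.pi * k / n  # bin frequency in rad/sample
        phase_diff = cmath.phase(s2[k] * s1[k].conjugate())
        delay = -phase_diff / omega  # unambiguous only if |omega * delay| < pi
        votes[round(delay / bin_width) * bin_width] += 1
    return votes


if __name__ == "__main__":
    n = 64
    # Channel 2 is channel 1 delayed by 2 samples (two sinusoidal components).
    x1 = [math.sin(2 * math.pi * 3 * t / n) + 0.8 * math.sin(2 * math.pi * 5 * t / n)
          for t in range(n)]
    x2 = [math.sin(2 * math.pi * 3 * (t - 2) / n) + 0.8 * math.sin(2 * math.pi * 5 * (t - 2) / n)
          for t in range(n)]
    print(delay_histogram(x1, x2).most_common(1))
```

In a full system the histogram would be accumulated over many frames, and its peaks would indicate the source directions used to build the separation masks.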

Details

Title

Speaker Localization And Speech Separation in Two Echoic Mixtures

Author

Kasprzak, Wlodzimierz; Ding, Ning; Hamada, Nozomu

Pages

n/a

Section

Articles

Publication year

2011

Publication date

2011

Publisher

Vilnius Gediminas Technical University, Department of Construction Economics & Property

ISSN

20292341

e-ISSN

20292252

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3846/mla.2011.009

ProQuest document ID

906705255
