Abstract

This paper analyses the performance of different types of Deep Neural Networks for jointly estimating age and identifying gender from speech, for application in Interactive Voice Response (IVR) systems used in call centres. Deep Neural Networks are used because they have recently demonstrated strong discriminative and representation capabilities in a wide range of applications, including speech processing problems based on feature extraction and selection. Networks of different sizes are analysed to determine how performance depends on the network architecture and the number of free parameters. The speech corpus used for the experiments is Mozilla's Common Voice dataset, an open, crowdsourced speech corpus. The results for gender classification are very good regardless of the type of neural network, and they improve with network size. For classification by age group, the combination of convolutional and temporal neural networks appears to be the best option among those analysed, and again, the larger the network, the better the results. The results are promising for use in IVR systems, with the best systems achieving a gender identification error of less than 2% and an age-group classification error of less than 20%.
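The abstract describes a model that combines a convolutional front end with temporal processing and produces two outputs, one for gender and one for age group. The sketch below illustrates that joint-estimation idea with a tiny numpy network: a 1-D convolution over MFCC-like frames, temporal average pooling, and two softmax heads. All architectural details (kernel sizes, 13 MFCCs, 4 age groups, random untrained weights) are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def conv1d_relu(frames, kernels):
    # frames: (T, F) time x features; kernels: (K, W, F).
    # Slides each kernel along the time axis and applies ReLU.
    K, W, F = kernels.shape
    T = frames.shape[0] - W + 1
    out = np.empty((T, K))
    for t in range(T):
        window = frames[t:t + W]  # (W, F)
        out[t] = np.maximum(
            0.0, np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
        )
    return out

def forward(frames, params):
    h = conv1d_relu(frames, params["conv"])  # temporal convolution
    z = h.mean(axis=0)                       # average pooling over time
    p_gender = softmax(z @ params["w_gender"])  # head 1: 2 classes
    p_age = softmax(z @ params["w_age"])        # head 2: age groups
    return p_gender, p_age

# Random weights stand in for trained ones (hypothetical shapes).
params = {
    "conv": rng.standard_normal((8, 5, 13)) * 0.1,  # 8 kernels, width 5, 13 MFCCs
    "w_gender": rng.standard_normal((8, 2)) * 0.1,
    "w_age": rng.standard_normal((8, 4)) * 0.1,     # 4 assumed age groups
}
frames = rng.standard_normal((100, 13))  # 100 frames of 13 MFCC features
p_gender, p_age = forward(frames, params)
print(p_gender.shape, p_age.shape)
```

The two heads share the convolutional features, which is what makes the estimation joint: one forward pass yields both a gender probability vector and an age-group probability vector.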

Details

Title
Age group classification and gender recognition from speech with temporal convolutional neural networks
Author
Sánchez-Hevia, Héctor A.¹; Gil-Pita, Roberto¹; Utrilla-Manso, Manuel¹; Rosa-Zurera, Manuel¹

¹ University of Alcalá, Signal Theory and Communications Department, Madrid, Spain (GRID:grid.7159.a) (ISNI:0000 0004 1937 0239)
Pages
3535-3552
Publication year
2022
Publication date
Jan 2022
Publisher
Springer Nature B.V.
ISSN
1380-7501
e-ISSN
1573-7721
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2631382773
Copyright
© The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.