Copyright © 2022 Abeer Ali Alnuaim et al. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Several speaker recognition algorithms have failed to achieve the best results because the datasets and feature sets used for classification vary widely. Gender information helps mitigate this problem, since grouping the classes by gender may lessen the impact of gender variability on the extracted features. This study attempted to construct a perfect classification model for language-independent gender identification using the Mozilla Common Voice dataset. Most previous studies extract features manually and feed them into a machine learning model for classification. Deep neural networks (DNNs) were the most effective strategy in our research. Nonetheless, the main goal was to take advantage of the wealth of information contained in voice data without requiring significant manual intervention. We trained the deep learning network to select essential information from speech spectrograms and pass it to the classification layer, which performs gender detection. The pretrained ResNet50, fine-tuned on gender data, achieved an accuracy of 98.57%, outperforming the traditional ML approaches and the previous works reported on the same dataset. Furthermore, the model performs well on additional datasets, demonstrating the approach’s generalization capacity.

Details

Title
Speaker Gender Recognition Based on Deep Neural Networks and ResNet50
Author
Abeer Ali Alnuaim 1; Zakariah, Mohammed 2; Shashidhar, Chitra 3; Wesam Atef Hatamleh 4; Tarazi, Hussam 5; Shukla, Prashant Kumar 6; Ratna, Rajnish 7

1 Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, P.O. Box 22459, Riyadh 11495, Saudi Arabia
2 College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
3 Department of Commerce and Management, Seshadripuram College, Seshadripuram, Bengaluru 20, India
4 Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
5 Department of Computer Science and Informatics, School of Engineering and Computer Science, Oakland University, Rochester Hills, MI, 318 Meadow Brook Rd, Rochester, MI 48309, USA
6 Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, 522502 Andhra Pradesh, India
7 Gedu College of Business Studies, Royal University of Bhutan, Bhutan
Editor
Mohammad Farukh Hashmi
Publication year
2022
Publication date
2022
Publisher
John Wiley & Sons, Inc.
e-ISSN
15308677
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2646636412