© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Voiceprint recognition systems are often trained on datasets that are limited in size and diversity, which hinders their performance and generalization. This study proposes an approach that uses generative adversarial networks (GANs) for data augmentation and convolutional neural networks (CNNs) with mel-frequency cepstral coefficients (MFCCs) for voiceprint classification. Experimental results show that the proposed method improves recognition accuracy by up to 15% in low-resource scenarios. A real-to-GAN-generated sample ratio of 3:2 was found to be optimal, balancing dataset diversity and model performance. In specific cases, the model achieved an accuracy of 96.6%, showing that it captures unique voice characteristics while mitigating overfitting. These results highlight the potential of combining GAN-augmented data with CNN-based classification to enhance voiceprint recognition in diverse and resource-constrained environments.
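
The 3:2 real-to-generated ratio reported above can be illustrated with a small sketch. The abstract does not describe how the authors assembled their training set, so everything here is an assumption for illustration: the function name `mix_real_and_synthetic`, the placeholder sample names, and the choice to keep every real sample and draw generated samples at random until the ratio is met.

```python
import random

def mix_real_and_synthetic(real, synthetic, ratio=(3, 2), seed=0):
    """Return a shuffled list of (sample, source) pairs with real:synthetic near `ratio`.

    Hypothetical helper: keeps all real samples and draws just enough
    synthetic ones to reach the requested ratio.
    """
    real_part, synth_part = ratio
    # Synthetic samples needed so that len(real) : n_synth matches the ratio.
    n_synth = (len(real) * synth_part) // real_part
    if n_synth > len(synthetic):
        raise ValueError("not enough synthetic samples for the requested ratio")
    rng = random.Random(seed)
    chosen = rng.sample(synthetic, n_synth)
    mixed = [(x, "real") for x in real] + [(x, "synthetic") for x in chosen]
    rng.shuffle(mixed)
    return mixed

# Placeholder identifiers standing in for real and GAN-generated utterances.
real = [f"real_{i}" for i in range(30)]
synthetic = [f"gan_{i}" for i in range(50)]
mixed = mix_real_and_synthetic(real, synthetic)
```

With 30 real samples, the sketch draws 20 generated samples, giving a training list of 50 items at a 3:2 ratio.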

Details

Title
Data Augmentation for Voiceprint Recognition Using Generative Adversarial Networks
Author
Yao-San, Lin 1; Hung-Yu, Chen 2; Huang, Mei-Ling 1; Tsung-Yu Hsieh 1

1 Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan; [email protected]
2 Department of Information Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhongshan Rd., Taiping Dist., Taichung 411, Taiwan; [email protected]
First page
583
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
1999-4893
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3149503664