© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Voice changes may be the earliest signs of laryngeal cancer. We investigated whether automated voice signal analysis can be used to distinguish patients with laryngeal cancer from healthy subjects. We extracted features using the software package for speech analysis in phonetics (PRAAT) and calculated the Mel-frequency cepstral coefficients (MFCCs) from voice samples of the sustained vowel /a:/. The proposed method was tested with six algorithms: support vector machine (SVM), extreme gradient boosting (XGBoost), light gradient boosted machine (LGBM), artificial neural network (ANN), one-dimensional convolutional neural network (1D-CNN) and two-dimensional convolutional neural network (2D-CNN). Their performance was evaluated in terms of accuracy, sensitivity, and specificity. The results were compared with human performance: four volunteers, two of whom were trained laryngologists, rated the same files. The 1D-CNN showed the highest accuracy, 85%, with sensitivity and specificity of 78% and 93%, respectively. The two laryngologists achieved an accuracy of 69.9% but a sensitivity of only 44%. Automated analysis of voice signals could differentiate subjects with laryngeal cancer from healthy subjects with better diagnostic performance than that of the four volunteers.
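The three evaluation measures named in the abstract follow directly from the counts of a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the function name `diagnostic_metrics`, the label convention (1 = laryngeal cancer, 0 = healthy) and the toy label lists are assumptions made for the example and do not reproduce any data from the study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels.

    Assumed convention (not taken from the article): 1 = cancer, 0 = healthy.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancer correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancer missed

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of true cancer cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of healthy subjects correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters for a comparison like the one in the abstract: a rater can reach a moderate accuracy while still missing most cancer cases, which only the sensitivity figure reveals.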

Details

Title
Convolutional Neural Network Classifies Pathological Voice Change in Laryngeal Cancer with High Accuracy
Author
Kim, HyunBum 1; Jeon, Juhyeong 2; Han, Yeon Jae 3; Joo, YoungHoon 1; Lee, Jonghwan 2; Lee, Seungchul 4; Im, Sun 3

1 Department of Otolaryngology-Head and Neck Surgery, Bucheon St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Korea (H.K.; Y.J.)
2 Department of Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang 37673, Korea (J.J.; J.L.)
3 Department of Rehabilitation Medicine, Bucheon St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul 06591, Korea
4 Department of Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang 37673, Korea (J.J.; J.L.); Graduate School of Artificial Intelligence, Pohang University of Science and Technology (POSTECH), Pohang 37673, Korea
First page
3415
Publication year
2020
Publication date
2020
Publisher
MDPI AG
e-ISSN
20770383
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2641049726