© 2022. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background: In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions.

Objective: We aimed to investigate whether reliable inferences—psychiatric signs, symptoms, and diagnoses—can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder.

Methods: We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation.
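
The evaluation scheme named above (AUROC under 5-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the nearest-centroid scorer, the stratified fold assignment, the helper names, and the synthetic feature values are all hypothetical stand-ins, since the abstract does not specify the model or the feature extraction.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds, keeping the class ratio roughly equal per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def centroid(rows):
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_auroc(X, y, k=5, seed=0):
    """Mean held-out AUROC of a nearest-centroid scorer (an assumed stand-in model)."""
    fold_aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        c0 = centroid([X[i] for i in train_idx if y[i] == 0])
        c1 = centroid([X[i] for i in train_idx if y[i] == 1])
        # Higher score means the sample sits closer to the class-1 centroid.
        scores = [sq_dist(X[i], c0) - sq_dist(X[i], c1) for i in test_idx]
        fold_aucs.append(auroc([y[i] for i in test_idx], scores))
    return mean(fold_aucs)


# Synthetic data only: group sizes mirror the two patient groups (41 and 21),
# but the three feature values per participant are random numbers.
rng = random.Random(42)
y = [1] * 41 + [0] * 21
X = [[rng.gauss(1.5 * label, 1.0) for _ in range(3)] for label in y]
result = cross_validated_auroc(X, y)
print(f"mean 5-fold AUROC on synthetic data: {result:.2f}")
```

Each participant is scored only by a model fitted on the other four folds, so the reported figure reflects held-out performance rather than fit to the training data.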

Results: The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units including cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74) provided the strongest signal for men. Vocal features, such as energy in the frequency band 1 to 4 kHz (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. Lip corner–pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61).
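
As a reading aid for the per-feature AUROC values above, the sketch below uses the pairwise interpretation of AUROC: the fraction of cross-group pairs in which the participant from the first group has the higher feature value. The function name and the feature values are hypothetical and made up for illustration; they are not data from the study, and the abstract does not state how single-feature AUROCs were computed.

```python
def pairwise_auroc(group_a, group_b):
    """Fraction of (a, b) pairs with a > b; ties count as half a win."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Made-up values of one feature for two diagnostic groups (illustration only).
schizophrenia_values = [0.9, 0.7, 0.6, 0.4]
bipolar_values = [0.8, 0.5, 0.3, 0.2]

print(pairwise_auroc(schizophrenia_values, bipolar_values))  # prints 0.75
```

Swapping the two groups gives 1 minus this value (0.25 here), so 0.5 corresponds to no separation between groups on that feature.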

Conclusions: This study advances efforts to use digital data to improve diagnostic assessment, and it supports the development of a new generation of clinical tools based on acoustic and facial data analysis.

Details

Title
Acoustic and Facial Features From Clinical Interviews for Machine Learning–Based Psychiatric Diagnosis: Algorithm Development
Author
Birnbaum, Michael L; Abrami, Avner; Heisig, Stephen; Ali, Asra; Arenare, Elizabeth; Agurto, Carla; Lu, Nathaniel; Kane, John M; Cecchi, Guillermo
First page
e24699
Section
Diagnostic Tools in Mental Health
Publication year
2022
Publication date
Jan 2022
Publisher
JMIR Publications
e-ISSN
23687959
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2624013387