TY - CONF
T1 - Multilingual Depression Detection Based on Speech Signals and Deep Learning
T2 - 2024 IEEE 10th International Conference on Big Data Computing Service and Machine Learning Applications (BigDataService)
AU - Liu, Lidan
AU - Tydeman, F.
AU - Xie, Wanqing
AU - Wang, Y.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/10/29
Y1 - 2024/10/29
AB - Current assessments for depressive disorder are often influenced by cognitive function, making them more susceptible to bias. Deep learning could provide more objective diagnoses with fewer access barriers for individuals who are unable to complete traditional assessments. In our study, we aim to explore the relations among speech, language, and depression to demonstrate the feasibility of multilingual speech depression detection, and then build deep learning models using multilingual speech samples to support depression diagnosis. We first used a newly collected Chinese speech depression dataset to build a convolutional neural network (CNN) for depression detection, and the accuracy on the test set reached 0.85. We also tested the same CNN model on the English depression speech dataset, DAIC-WOZ, and the accuracy on the test set was 0.73. When the model was trained on both Chinese and English speech samples and tested on mixed-language speech, the accuracy reached 0.74. We found that the CNN model can be applied across languages with relatively stable depression detection performance. This provides evidence that it is possible to develop a language-independent depression detection tool to support depression diagnosis and enable worldwide long-term mental health monitoring.
UR - http://www.scopus.com/inward/record.url?scp=85209675847&partnerID=8YFLogxK
U2 - 10.1109/BigDataService62917.2024.00024
DO - 10.1109/BigDataService62917.2024.00024
M3 - Paper
SP - 115
EP - 116
ER -