Abstract
Introduction: Speech- and language-based tools offer a promising, accessible solution for real-time assessment of depression symptom severity [1]. Such tools could support clinicians in mental health services, where severe staff shortages and long waiting lists delay the diagnosis and treatment of Major Depressive Disorder (MDD) [2]. To date, insights into speech and language biomarkers associated with changes in MDD are limited because they have mainly been drawn from non-clinical, cross-sectional studies. Using a clinical speech dataset [3], this work had two aims: (i) to explore associations between lexical features and MDD symptom severity; and (ii) to evaluate the efficacy of combinations of lexical features and text embeddings for predicting MDD symptom severity.
Corpus: We used data from the Remote Assessment of Disease and Relapse in Major Depressive Disorder (RADAR-MDD) study. RADAR-MDD was a longitudinal cohort study examining the utility of multi-parametric remote measurement technologies (RMT), including speech, to measure symptom changes and predict relapse in people with MDD [4]. During the study, participants were asked to record a response to the question “Can you describe something you are looking forward to this week?” [3]. The full eligibility and exclusion criteria for RADAR-MDD are detailed in [4]. Ethical approval was obtained from the Camberwell St. Giles Research Ethics Committee (17/LO/1154) in London. A patient advisory board co-developed the RADAR-MDD protocol with input on several aspects of the study, including the speech tasks.
Methods: Lexical features were extracted from automated transcriptions of RADAR-MDD speech recordings. Associations between these features and depression symptom severity, as measured using the 8-item Patient Health Questionnaire (PHQ-8), were determined using mixed-effects models. Additionally, machine learning regressors were trained with nested cross-validation to predict PHQ-8 scores from lexical features, Term Frequency-Inverse Document Frequency (TF-IDF) features and RoBERTa embeddings [5]. A sensitivity analysis was conducted using transcripts from one open-source and one commercial Automatic Speech Recognition (ASR) tool: Whisper [6] and Speechmatics, respectively.
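As an illustration of what a lexical feature can look like, the sketch below computes simple word-category frequencies from a transcript. It is a minimal example under stated assumptions, not the feature extraction used in this work: the word lists, the tokeniser and the function names (`tokenize`, `lexical_features`) are hypothetical placeholders.

```python
import re
from collections import Counter

# Illustrative word lists only (assumptions, not the study's lexicon).
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
PAST_FOCUS = {"was", "were", "had", "did", "went", "been"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens.

    Contractions such as "I'm" stay as one token, so they are not
    counted as "i" by the simple set lookup below.
    """
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(transcript):
    """Return per-transcript lexical features as relative frequencies."""
    tokens = tokenize(transcript)
    n_tokens = len(tokens)
    if n_tokens == 0:
        # Empty transcript: avoid division by zero.
        return {
            "n_tokens": 0,
            "first_person_freq": 0.0,
            "past_focus_freq": 0.0,
            "type_token_ratio": 0.0,
        }
    counts = Counter(tokens)
    first_person = sum(counts[w] for w in FIRST_PERSON)
    past_focus = sum(counts[w] for w in PAST_FOCUS)
    return {
        "n_tokens": n_tokens,
        "first_person_freq": first_person / n_tokens,
        "past_focus_freq": past_focus / n_tokens,
        "type_token_ratio": len(counts) / n_tokens,
    }


example = lexical_features("I am looking forward to seeing my sister")
```

Normalising by transcript length keeps the features comparable across recordings of different durations; a validated lexicon would normally replace the toy word lists.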
Results: Our dataset contains 3,963 recordings from 350 individuals (male: 907 recordings, female: 3,056 recordings; mean PHQ-8: 8.94). Mixed-effects modelling indicated robust associations between several lexical features and MDD symptom severity (Figure 1). Our machine learning results, particularly the low R² values, indicate potential overfitting (Table 1). Our sensitivity analysis demonstrated consistent results across the different ASR models.
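For readers less familiar with the coefficient of determination, the sketch below shows how it is computed and why low or negative values on held-out data signal poor generalisation. The scores are made-up toy numbers for illustration only, not values from this study, and `r_squared` is a hypothetical helper name.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy held-out PHQ-8 scores (illustrative, not study data).
held_out = [4, 8, 12]

perfect = r_squared(held_out, [4, 8, 12])   # predictions match exactly
baseline = r_squared(held_out, [8, 8, 8])   # always predict the mean
worse = r_squared(held_out, [12, 8, 4])     # worse than the mean baseline
```

Here `perfect` is 1, `baseline` is 0 and `worse` is negative: a model that fits its training data well but scores near or below zero on held-out data is doing no better than predicting the mean score.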
Discussion: We observed several expected associations between depression severity and our lexical features. The lack of trend for past-focus word frequency and first-person pronoun frequency could be attributed to the prompt phrasing. Future work is required to identify a more suitable feature-model combination to improve our prediction performance.
References
[1] J. Robin et al., 2020, doi: 10.1159/000510820.
[2] M. Deakin et al., 2022, doi: 10.1136/BMJ.O945.
[3] N. Cummins et al., 2023, doi: 10.1016/j.jad.2023.08.097.
[4] F. Matcham et al., 2019, doi: 10.1186/s12888-019-2049-z.
[5] Y. Liu et al., 2019, doi: 10.48550/arXiv.1907.11692.
[6] A. Radford et al., 2022, doi: 10.48550/arXiv.2212.04356.
Original language | English |
---|---|
Number of pages | 1 |
Publication status | Accepted/In press - 12 May 2025 |
Event | UK and Ireland Speech Workshop 2025 - University of York, York, United Kingdom Duration: 16 Jun 2025 → 17 Jun 2025 https://sites.google.com/york.ac.uk/ukis2025/ |
Conference
Conference | UK and Ireland Speech Workshop 2025 |
---|---|
Abbreviated title | UKIS 2025 |
Country/Territory | United Kingdom |
City | York |
Period | 16/06/2025 → 17/06/2025 |