Predicting clinical outcomes in psychotic disorders using electronic case registers and natural language processing

Research output: Contribution to journal; Meeting abstract; peer-review



Background: It is not possible to reliably predict clinical outcomes in psychotic disorders. Existing research studies are based on relatively modest sample sizes and may not be representative of everyday clinical practice. Clinical information is widely recorded in the form of electronic health records (EHRs). The majority of useful data are stored in unstructured free text entries. However, the large volume of free text means that it is not feasible to manually read through records to identify data of interest. Automated information extraction methods such as natural language processing (NLP) offer the opportunity to quickly extract and analyse large volumes of meaningful data from free text EHRs. I present a summary of three studies using this approach to investigate clinical outcomes in people with schizophrenia.

Dataset: South London and Maudsley NHS Trust (SLaM) Biomedical Research Centre (BRC) Case Register comprising anonymised EHRs of over 250,000 people.
NLP development: The software package TextHunter was used. All sentences containing keywords relevant to the constructs investigated were extracted using a support vector machine (SVM) learning approach.
Predictor variables: presentation to high-risk clinical services, cannabis use (NLP-derived) and negative symptoms (NLP-derived). Outcomes: number of days spent in hospital, frequency of hospital admission and antipsychotic treatment failure.
Covariates: age, gender, ethnicity, marital status and diagnosis.
Statistical analysis: multivariable logistic, negative binomial and linear regression, and mediation analysis, all performed in Stata.
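
The keyword step described under NLP development can be illustrated with a minimal sketch. This is not TextHunter's code: the keyword list, the sample note and the function names are hypothetical, the sentence splitter is deliberately naive, and the SVM classification step is not shown.

```python
import re

# Hypothetical keyword list; the abstract does not state the actual terms used.
KEYWORDS = ("cannabis", "skunk", "weed")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_keyword_sentences(text, keywords=KEYWORDS):
    """Return the sentences that contain any keyword as a whole word (case-insensitive)."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, keywords)) + r")\b", re.IGNORECASE
    )
    return [s for s in split_sentences(text) if pattern.search(s)]


# Invented free-text note, for illustration only.
note = (
    "Patient attended clinic. Reports smoking cannabis daily. "
    "Denies alcohol use. No Cannabis since May."
)
candidates = extract_keyword_sentences(note)
print(candidates)
# ['Reports smoking cannabis daily.', 'No Cannabis since May.']
```

Both candidate sentences mention the keyword, but only the first describes current use. Keyword matching alone cannot make that distinction, which is why a trained classifier (an SVM in the studies summarised here) would be applied to the extracted sentences.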

(i) Clinical outcomes of first-episode psychosis (FEP) in high-risk services (n = 2,943): 164 patients with FEP (5.6%) presented to OASIS, a clinical service in South London for young people with an at-risk mental state (ARMS) for psychosis. Compared with presentation to conventional services, presentation to the high-risk service was associated with 17 fewer days spent in hospital (95% CI -33.7, -0.3) and a lower frequency of admission (incidence rate ratio: 0.49, 0.39-0.61) in the 24 months following referral.
(ii) Cannabis and treatment failure in FEP (n = 2,026): Cannabis use was present in 46.3% of people with FEP. It was associated with an increased frequency of hospital admission (incidence rate ratio 1.50, 1.25-1.80) and a greater number of days spent in hospital (B coefficient 35.1 days, 12.1-58.1). An increase in the number of unique antipsychotics prescribed to cannabis users mediated an increased frequency of hospital admission (natural indirect effect [NIE]: 1.11, 1.04-1.17; total effect [TE]: 1.41, 1.22-1.64) and a greater number of days spent in hospital (NIE: 16.1, 6.7-25.5; TE: 19.9, 2.5-37.3).
(iii) Negative symptoms and clinical outcomes in chronic schizophrenia (n = 7,678): 55.7% of people with schizophrenia had at least one negative symptom documented. Negative symptoms were associated with an increased likelihood of hospital admission (odds ratio 1.24, 95% CI 1.10-1.39) and re-admission (1.58, 1.28-1.95), and with a longer length of stay (B coefficient 20.5, 7.6-33.5).
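
As a reading aid for the ratio estimates above, the following sketch converts a reported ratio into a percentage change relative to the comparison group and re-derives the OASIS proportion. The function names are illustrative, only point estimates are used, and the confidence intervals are ignored, so this is arithmetic on the reported figures rather than a reanalysis.

```python
def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def ratio_to_percent_change(ratio):
    """Percentage change implied by a ratio estimate (1.0 means no difference)."""
    return round((ratio - 1) * 100, 1)


# Study (i): 164 of 2,943 FEP patients presented to the high-risk service.
print(percent(164, 2943))             # 5.6

# Study (i): incidence rate ratio 0.49, i.e. about 51% fewer admissions.
print(ratio_to_percent_change(0.49))  # -51.0

# Study (ii): incidence rate ratio 1.50, i.e. about 50% more admissions.
print(ratio_to_percent_change(1.50))  # 50.0

# Study (iii): odds ratio 1.24, i.e. about 24% higher odds of admission.
print(ratio_to_percent_change(1.24))  # 24.0
```

Note that the odds ratio describes a change in odds, not in probability, so the last figure should not be read as a 24% higher risk of admission.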

Discussion: It was possible to use EHR data extracted using NLP to investigate associations with clinical outcomes of psychosis in large sample sizes which would otherwise have been unfeasible to investigate using direct patient recruitment. These findings are important for mental healthcare services as they suggest that early detection of psychosis in high-risk services may be associated with better outcomes, and that greater attention should be given to cannabis use and negative symptoms in people with established psychotic disorders. The NLP tools developed in these studies also have the potential to support real-time clinical decision making at an individual patient level.
Original language: English
Article number: 16007
Pages (from-to): T139
Publication status: Accepted/In press - 5 Apr 2016

