Cognitive Impairments in Schizophrenia: A Study in a Large Clinical Sample Using Natural Language Processing

Aurelie Mascio*, Robert Stewart, Riley Botelle, Marcus Williams, Luwaiza Mirza, Rashmi Patel, Thomas Pollak, Richard Dobson, Angus Roberts

*Corresponding author for this work

Research output: Contribution to journal (Article, peer-reviewed)

5 Citations (Scopus)


Background: Cognitive impairments are a neglected aspect of schizophrenia despite being a major factor in poor functional outcomes. They are usually measured using various rating scales; however, these require trained practitioners and are rarely applied routinely in clinical settings. Recent advances in natural language processing techniques allow us to extract such information from unstructured portions of text at large scale and in a cost-effective manner. We aimed to identify cognitive problems in the clinical records of a large sample of patients with schizophrenia, and to assess their association with clinical outcomes.

Methods: We developed a natural language processing-based application that identifies cognitive dysfunctions from the free text of medical records, and assessed its performance against a rating scale widely used in the United Kingdom, the cognitive component of the Health of the Nation Outcome Scales (HoNOS). Furthermore, we analyzed cognitive trajectories over the course of patient treatment, and evaluated their relationship with various socio-demographic factors and clinical outcomes.

Results: We found a high prevalence of cognitive impairments in patients with schizophrenia, and a strong correlation with several socio-demographic factors (gender, education, ethnicity, marital status, and employment) as well as adverse clinical outcomes. Results obtained from the free text were broadly in line with those obtained using the HoNOS subscale, and shed light on additional associations, notably related to attention and social impairments for patients with higher education.

Conclusions: Our findings demonstrate that cognitive problems are common in patients with schizophrenia, can be reliably extracted from clinical records using natural language processing, and are associated with adverse clinical outcomes. Harvesting the free text from medical records provides broader coverage than neurocognitive batteries or rating scales, as well as access to additional socio-demographic and clinical variables. Text mining tools can therefore facilitate large-scale patient screening and early symptom detection, and ultimately help inform clinical decisions.

Original language: English
Article number: 711941
Journal: Frontiers in Digital Health
Publication status: Published - 15 Jul 2021


  • cognition
  • data mining
  • electronic health records
  • natural language processing
  • schizophrenia

