Longitudinal changes in cognitive impairment for patients with Schizophrenia

Student thesis: Doctoral Thesis, Doctor of Philosophy


This thesis examines the longitudinal changes in cognition for patients with schizophrenia using free text from medical records. To this end, we introduce a unified framework to extract, annotate, classify and analyse cognitive impairments from unstructured text, evaluate symptom trajectories and measure their association with socio-demographic factors and clinical outcomes. The framework was further extended to any type of symptom that can be defined using a list of keywords, and it can be easily implemented and deployed within the Clinical Record Interactive Search (CRIS) system, which provides researchers with regulated and secure access to anonymised information from clinical records.
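As a rough illustration of what defining a symptom by a list of keywords could look like, the following minimal Python sketch finds keyword mentions in a note and returns each one with a window of surrounding text. It is not the thesis's implementation: the function name `extract_mentions`, the `Mention` record, the example keywords and the 40-character window are all assumptions made for this example.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    """One keyword hit plus the surrounding text (hypothetical record type)."""
    keyword: str
    start: int
    end: int
    context: str


def extract_mentions(text: str, keywords: List[str], window: int = 40) -> List[Mention]:
    """Return every whole-word keyword match with `window` characters of context.

    The keyword list and window size are illustrative assumptions, not values
    taken from the thesis.
    """
    if not keywords:
        return []
    # Longest keywords first, so multi-word terms win over their substrings.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in ordered) + r")\b",
        re.IGNORECASE,
    )
    mentions = []
    for match in pattern.finditer(text):
        lo = max(0, match.start() - window)
        hi = min(len(text), match.end() + window)
        mentions.append(
            Mention(match.group(1).lower(), match.start(), match.end(), text[lo:hi])
        )
    return mentions


# Example with a made-up note: both mentions are found, including the negated one.
note = "Patient reports poor concentration. No memory problems noted."
mentions = extract_mentions(note, ["concentration", "memory"])
```

Keyword matching alone cannot tell an affirmed mention from a negated one (the second hit above sits in "No memory problems"), which is why a separate classification step is applied to the extracted context.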

A standardized approach to extract and annotate portions of unstructured text relevant to cognitive impairments was developed in conjunction with clinicians and researchers. This annotated dataset was then used to train text classification algorithms to separate affirmed mentions of cognitive symptoms from irrelevant or negated ones. An extensive comparative study of existing text classification methods, within both the biomedical and general domains, was conducted on public and internal datasets. The results showed that transformer-based approaches, which are the current state of the art for many natural language processing tasks, outperform other methods in terms of accuracy, ease of implementation and scalability, particularly when trained on a combination of general and medical data. This text classification model was subsequently used to derive cognitive score time series from the free text of medical records. This “digital signature” of cognitive changes was in turn validated against scores obtained from clinically administered tests, confirming the accuracy and reliability of the model.

Symptom trajectories were then evaluated using mixed linear models, again comparing the results obtained with the transformer model against standardized instruments. Both approaches demonstrated similar rates of change, indicating a gradual cognitive decline with age, which is attenuated by certain socio-demographic factors such as education, employment or marital status. The transformer-based model highlighted a strong association between education and cognition, showing that certain cognitive impairments, specifically attention and social cognition, were more likely to be reported early for patients with a higher education level. The relationship between cognition and clinical outcomes was also analysed, indicating that cognitive problems are correlated with adverse outcomes. This supports the findings in the literature that these symptoms account for much of the disability associated with schizophrenia.

Finally, the text classification framework was tested and generalized to cover other symptoms and patient groups, allowing the development of a standardized set of tools that were then deployed within health research settings. This formed the basis for other research, notably COVID-related projects, which involved extracting mentions of anxiety and violent behaviour from the free text of clinical records, paving the way to further clinical applications.

The contribution of this research is both methodological and practical. The use of a novel symptom extraction, classification and analysis framework demonstrates that cognitive impairments can be reliably harvested from the free text of medical records using deep learning models. The framework shows that these impairments are common in patients with schizophrenia and are correlated with adverse clinical outcomes. It provides a scalable and adaptable means of conducting research using large, unstructured datasets, typical of the vast amount of data routinely collected in clinical records. Such automated tools can be utilized to detect early impairments, screen individuals and identify those who would benefit from more comprehensive assessments, and ultimately support real-time clinical decision making.
Date of Award: 1 May 2022
Original language: English
Awarding Institution: King's College London
Supervisors: Angus Roberts, Robert Stewart & Richard Dobson
