King's College London

Research portal

Semantic computational analysis of anticoagulation use in atrial fibrillation from real world data

Research output: Contribution to journal › Article › peer-review

Original language: English
Article number: e0225625
Journal: PLOS One
Issue number: 11
Accepted/In press: 2019
Published: 25 Nov 2019




Atrial fibrillation (AF) is the most common arrhythmia and significantly increases stroke risk. This risk is effectively managed by oral anticoagulation (OAC). Recent studies using national registry data indicate increased use of anticoagulation resulting from changes in guidelines and the availability of newer drugs. The aim of this study is to develop and validate an open source risk scoring pipeline for free-text electronic health record (EHR) data using natural language processing. AF patients discharged from 1st January 2011 to 1st October 2017 were identified from discharge summaries (N = 10,030, 64.6% male, average age 75.3 ± 12.3 years). A natural language processing pipeline was developed to identify risk factors in clinical text and calculate risk for ischaemic stroke (CHA₂DS₂-VASc) and bleeding (HAS-BLED). Scores were validated against two independent experts for 40 patients. Automatic risk scores were in strong agreement with the two independent experts for CHA₂DS₂-VASc (average kappa 0.78 vs experts, compared to 0.85 between experts). Agreement was lower for HAS-BLED (average kappa 0.54 vs experts, compared to 0.74 between experts). In high-risk patients (CHA₂DS₂-VASc ≥ 2), OAC use has increased significantly over the last 7 years, driven by the availability of direct oral anticoagulants (DOACs) and the transitioning of patients from antiplatelet (AP) medication alone to OAC. Factors independently associated with OAC use included components of the CHA₂DS₂-VASc and HAS-BLED scores as well as discharging specialty and frailty. OAC use was highest in patients discharged under cardiology (69%). Electronic health record text can be used for automatic calculation of clinical risk scores at scale. Open source tools are available today for this task but require further validation. Analysis of routinely collected EHR data can replicate findings from large-scale curated registries.
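
To make the scoring and validation steps in the abstract concrete, the following is a minimal Python sketch, not the study's pipeline. It assumes the risk factors have already been extracted from the clinical text into boolean flags. The `RiskFactors` container and both function names are hypothetical. The point values follow the standard published CHA₂DS₂-VASc definition. The abstract does not say which kappa variant was used, so unweighted Cohen's kappa is an assumption here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RiskFactors:
    """Hypothetical container for flags an NLP pipeline might extract."""
    age: int
    female: bool
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    stroke_or_tia: bool = False      # prior stroke, TIA or thromboembolism
    vascular_disease: bool = False


def cha2ds2_vasc(p: RiskFactors) -> int:
    """Sum the standard CHA2DS2-VASc points for one patient."""
    score = 0
    score += 1 if p.heart_failure else 0      # C: congestive heart failure
    score += 1 if p.hypertension else 0       # H: hypertension
    if p.age >= 75:                           # A2: age 75 or older
        score += 2
    elif p.age >= 65:                         # A: age 65 to 74
        score += 1
    score += 1 if p.diabetes else 0           # D: diabetes mellitus
    score += 2 if p.stroke_or_tia else 0      # S2: stroke/TIA/thromboembolism
    score += 1 if p.vascular_disease else 0   # VA: vascular disease
    score += 1 if p.female else 0             # Sc: sex category (female)
    return score


def cohen_kappa(rater_a, rater_b) -> float:
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    patient = RiskFactors(age=76, female=False, hypertension=True, diabetes=True)
    score = cha2ds2_vasc(patient)
    print(score, "high risk" if score >= 2 else "lower risk")
```

In a validation like the one described, `cohen_kappa` would be called once per pairing (pipeline vs each expert, and expert vs expert) on the scores assigned to the same patients, and the pipeline-vs-expert values averaged.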

