TextHunter--A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research

Richard G. Jackson MSc, Michael Ball, Rashmi Patel, Richard D. Hayes, Richard J B Dobson, Robert Stewart

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)

Abstract

Observational research using data from electronic health records (EHR) is a rapidly growing area that promises both larger sample sizes and richer data, and therefore unprecedented study power. However, in many medical domains, large amounts of potentially valuable data are contained within the free-text clinical narrative. Manually reviewing free text to obtain the desired information is an inefficient use of researcher time and skill. Previous work has demonstrated the feasibility of applying Natural Language Processing (NLP) to extract information. However, in real-world research environments, the demand for NLP skills outweighs supply, creating a bottleneck in the secondary exploitation of the EHR. To address this, we present TextHunter, a tool for creating training data, constructing concept extraction machine learning models, and applying those models to documents. Using confidence thresholds to ensure high precision (>90%), we achieved recall as high as 99% in real-world use cases.

Original language: English
Pages (from-to): 729-738
Number of pages: 10
Journal: Proceedings of the American Medical Informatics Association
Volume: 2014
Publication status: Published - 2014
