Comparison of Named Entity Recognition tools for raw OCR text

K.J. Rodriquez, Michael Bryant, T. Blanke, M. Luszczynska

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

46 Citations (Scopus)


This short paper analyses an experiment comparing the efficacy of several Named Entity Recognition (NER) tools at extracting entities directly from the output of an optical character recognition (OCR) workflow. The authors present how they first created a set of test data, consisting of raw and corrected OCR output manually annotated with people, locations, and organizations. They then ran each of the NER tools against both raw and corrected OCR output, comparing the precision, recall, and F1 score against the manually annotated data.
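The comparison described in the abstract rests on three standard metrics. As a minimal sketch of how precision, recall, and F1 could be computed for one tool against the manual annotation, the Python below treats each entity as a (text, type) pair and counts only exact matches. The exact-match criterion, the function name `score_entities`, and the sample entities are assumptions made for illustration; the abstract does not state how the authors matched entities.

```python
from typing import NamedTuple, Set, Tuple

# An entity is a (surface text, type) pair, e.g. ("Vienna", "LOC").
# Representing entities this way is an assumption for illustration.
Entity = Tuple[str, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_entities(gold: Set[Entity], predicted: Set[Entity]) -> Scores:
    """Score predicted entities against gold annotations using exact matching."""
    true_positives = len(gold & predicted)
    # Precision: share of predicted entities that appear in the gold set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold entities that the tool found.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f1)


# Made-up sample data, not taken from the paper.
gold = {
    ("Anna Schmidt", "PER"),
    ("Vienna", "LOC"),
    ("Red Cross", "ORG"),
    ("Prague", "LOC"),
}
# The OCR error "Red Gross" prevents an exact match with the gold entity.
predicted = {
    ("Anna Schmidt", "PER"),
    ("Vienna", "LOC"),
    ("Red Gross", "ORG"),
}

scores = score_entities(gold, predicted)
print(f"P={scores.precision:.3f} R={scores.recall:.3f} F1={scores.f1:.3f}")
```

Running the same function once on raw OCR output and once on corrected output would give the kind of paired comparison the abstract describes, under the matching assumption stated above.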
Original language: English
Title of host publication: Empirical Methods in Natural Language Processing
Subtitle of host publication: Proceedings of the Conference on Natural Language Processing 2012
Editors: Jeremy Jancsary
Place of Publication: Wien, Austria
Publisher: Austrian Society for Artificial Intelligence (ÖGAI)
Number of pages: 4
ISBN (Electronic): 3-85027-005-X
Publication status: Published - 2012

