Abstract
This short paper analyses an experiment comparing the efficacy of several Named Entity Recognition (NER) tools at extracting entities directly from the output of an optical character recognition (OCR) workflow. The authors present how they first created a set of test data, consisting of raw and corrected OCR output manually annotated with people, locations, and organizations. They then ran each of the NER tools against both raw and corrected OCR output, comparing the precision, recall, and F1 score against the manually annotated data.
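
As an illustration of the evaluation the abstract mentions, the following is a minimal sketch of how precision, recall, and F1 could be computed for one tool against a manually annotated reference. It assumes exact matching on character offsets and entity type; the abstract does not state the matching criterion the authors used. The `Entity` type, the `score` function, and the sample annotations are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: (start offset, end offset, entity type).
Entity = Tuple[int, int, str]


def score(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring (an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, not taken from the paper.
gold = {(0, 5, "PER"), (10, 16, "LOC"), (20, 30, "ORG")}
predicted = {(0, 5, "PER"), (10, 16, "ORG")}  # second entity has the wrong type

precision, recall, f1 = score(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function once on entities extracted from raw OCR output and once on entities extracted from corrected output would give the kind of side-by-side comparison the abstract describes.
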
Original language | English |
---|---|
Title of host publication | Empirical Methods in Natural Language Processing |
Subtitle of host publication | Proceedings of the Conference on Natural Language Processing 2012 |
Editors | Jeremy Jancsary |
Place of Publication | Wien, Austria |
Publisher | Austrian Society for Artificial Intelligence (ÖGAI) |
Pages | 410-414 |
Number of pages | 4 |
Volume | 5 |
ISBN (Electronic) | 3-85027-005-X |
Publication status | Published - 2012 |