King's College London

Research portal

Normalization of Imprecise Temporal Expressions Extracted from Text

Research output: Contribution to journal › Article

Hegler Tissot, Marcos Del Fabro, Leon Derczynski, Angus Roberts

Original language: English
Journal: Knowledge and Information Systems
Publication status: Published - 15 Feb 2019

Abstract

Information extraction systems and techniques have been widely used to deal with the increasing amount of unstructured data now available. Time is among the different kinds of information that may be extracted from such unstructured data sources, including text documents. However, the inability to correctly identify and extract temporal information from text makes it difficult to understand how the extracted events are organised in chronological order. Furthermore, in many situations, the meaning of temporal expressions (timexes) is imprecise, such as in “less than 2 years” and “several weeks”, and cannot be accurately normalised, leading to interpretation errors. Although there are some approaches that enable representing imprecise timexes, they are not designed to be applied to specific scenarios and are difficult to generalise. This paper presents a novel methodology to analyse and normalise imprecise temporal expressions by representing temporal imprecision in the form of membership functions, based on human interpretation of time in two different languages (Portuguese and English). Each resulting model is a generalisation of probability distributions in the form of trapezoidal and hexagonal fuzzy membership functions. We use an adapted F1-score to guide the choice of the best models for each kind of imprecise timex, and a weighted F1-score (F1 3D) as a complementary metric in order to identify relevant differences when comparing two normalisation models. We apply the proposed methodology to three distinct classes of imprecise timexes, and the resulting models give distinct insights into the way each kind of temporal expression is interpreted.
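
To illustrate what a trapezoidal fuzzy membership function looks like in practice, the following is a minimal Python sketch. It is not the authors' implementation: the function name `trapezoid` and the breakpoints used for “several weeks” (2, 3, 6 and 10 weeks) are hypothetical values chosen only for illustration, not figures taken from the paper.

```python
def trapezoid(x, a, b, c, d):
    """Degree (0.0 to 1.0) to which x belongs to a trapezoidal fuzzy set.

    Membership rises linearly from a to b, is 1.0 on the plateau [b, c],
    and falls linearly from c to d. Requires a <= b <= c <= d.
    """
    if not a <= b <= c <= d:
        raise ValueError("breakpoints must satisfy a <= b <= c <= d")
    if b <= x <= c:          # plateau: full membership
        return 1.0
    if x <= a or x >= d:     # outside the support: no membership
        return 0.0
    if x < b:                # rising edge
        return (x - a) / (b - a)
    return (d - x) / (d - c) # falling edge


# Hypothetical breakpoints (in weeks) for the timex "several weeks".
SEVERAL_WEEKS = (2, 3, 6, 10)

for weeks in (1, 2.5, 4, 8, 12):
    print(weeks, trapezoid(weeks, *SEVERAL_WEEKS))
```

Under these assumed breakpoints, a duration of 4 weeks counts fully as “several weeks”, 8 weeks only partially, and 12 weeks not at all, which is the kind of graded interpretation a crisp interval cannot express.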

