Towards the domain agnostic generation of natural language explanations from provenance graphs for casual users

Darren Richardson, Luc Moreau

Research output: Chapter in Book/Report/Conference proceeding > Other chapter contribution > peer-review



As more systems become PROV-enabled, there will be a corresponding increase in the need to communicate provenance data directly to users. Whilst there are a number of existing methods for doing this -- formally, diagrammatically, and textually -- there are currently no application-generic techniques for generating linguistic explanations of provenance. The principal reason for this is that a certain amount of linguistic information is required to transform a provenance graph -- such as in PROV -- into a textual explanation, and if this information is not available as an annotation, this transformation is presently not possible. In this paper, we describe how we have adapted the common "consensus" architecture from the field of natural language generation to achieve this graph transformation, resulting in the novel PROVglish architecture. We then present an approach to garnering the necessary linguistic information from a PROV dataset, which involves exploiting the linguistic information informally encoded in the URIs denoting provenance resources. We finish by detailing an evaluation undertaken to assess the effectiveness of this approach to lexicalisation, demonstrating a significant improvement in terms of fluency, comprehensibility, and grammatical correctness.
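To make the idea of linguistic information "informally encoded in the URIs" concrete, the following is a minimal Python sketch of one way words might be recovered from a resource URI. It is an illustration only and is not the PROVglish implementation: the function names `local_name` and `uri_to_words`, the splitting rules (camelCase boundaries, underscores, hyphens, dots), and the `example.org` URIs are all assumptions made for this example.

```python
import re
from urllib.parse import urlparse, unquote


def local_name(uri: str) -> str:
    """Return the URI fragment if present, else the last path segment.

    Hypothetical helper: it assumes the human-readable part of a
    provenance URI sits after the final '#' or '/'.
    """
    parsed = urlparse(uri)
    if parsed.fragment:
        return unquote(parsed.fragment)
    path = parsed.path.rstrip("/")
    return unquote(path.rsplit("/", 1)[-1])


def uri_to_words(uri: str) -> list[str]:
    """Split a URI's local name into lower-case candidate words.

    Assumed heuristics: underscores, hyphens and dots separate words,
    and a lower-case letter or digit followed by an upper-case letter
    marks a camelCase boundary.
    """
    name = local_name(uri)
    # Replace explicit separators with spaces.
    name = re.sub(r"[_\-.]+", " ", name)
    # Insert a space at each camelCase boundary.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [word.lower() for word in name.split()]


if __name__ == "__main__":
    print(uri_to_words("http://example.org/ns#composedArticle"))
    print(uri_to_words("http://example.org/data/bar_chart-1"))
```

Under these assumptions, a local name such as `composedArticle` yields the candidate words `composed` and `article`, which a later stage could then tag and use when lexicalising the graph. Opaque identifiers (for example hashes or numeric keys) carry no such information, so a heuristic of this kind only helps where URIs were minted with readable names.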
Original language: English
Title of host publication: IPAW'2016: 6th International Provenance and Annotation Workshop
Place of Publication: McLean, VA, US
Publisher: Springer Berlin Heidelberg
Number of pages: 12
Publication status: Published - 1 Jun 2016

