King's College London

PubHealthTab: A Public Health Table-based Dataset for Evidence-based Fact Checking

Research output: Chapter in Book/Report/Conference proceeding; Conference paper; peer-review

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: NAACL 2022 - Findings
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-16
Number of pages: 16
ISBN (Electronic): 9781955917766
Published: 2022
Event: 2022 Findings of the Association for Computational Linguistics: NAACL 2022 - Seattle, United States
Duration: 10 Jul 2022 to 15 Jul 2022

Publication series

Name: Findings of the Association for Computational Linguistics: NAACL 2022 - Findings

Conference

Conference: 2022 Findings of the Association for Computational Linguistics: NAACL 2022
Country/Territory: United States
City: Seattle
Period: 10/07/2022 to 15/07/2022

Bibliographical note

Funding Information: The authors acknowledge support from the Distributed AI (DAI) research group at King’s College London for creating the dataset.
Publisher Copyright: © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.

Abstract

Inspired by human fact checkers, who use different types of evidence (e.g. tables, images, audio) in addition to text, several datasets with tabular evidence data have been released in recent years. Whilst these datasets encourage research on table fact-checking, they rely on information from restricted data sources, such as Wikipedia, for creating claims and extracting evidence data, which makes the fact-checking process different from the real-world process used by fact checkers. In this paper, we introduce PubHealthTab, a table fact-checking dataset based on real-world public health claims and noisy evidence tables from sources similar to those used by real fact checkers. We outline our approach for collecting evidence data from various websites and present an in-depth analysis of our dataset. Finally, we evaluate state-of-the-art table representation and pre-trained models fine-tuned on our dataset, achieving an overall F1 score of 0.73.
