TY - CHAP
T1 - The Fact Extraction and VERification Over Unstructured and Structured information (FEVEROUS) Shared Task
AU - Aly, Rami
AU - Guo, Zhijiang
AU - Schlichtkrull, Michael
AU - Thorne, James
AU - Vlachos, Andreas
AU - Christodoulopoulos, Christos
AU - Cocarascu, Oana
AU - Mittal, Arpit
N1 - Funding Information:
We would like to thank Amazon for sponsoring the dataset generation and supporting the FEVER workshop and the FEVEROUS shared task. Rami Aly is supported by the Engineering and Physical Sciences Research Council Doctoral Training Partnership (EPSRC). James Thorne is supported by an Amazon Alexa Graduate Research Fellowship. Zhijiang Guo, Michael Schlichtkrull and Andreas Vlachos are supported by the ERC grant AVeriTeC (GA 865958).
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - The Fact Extraction and VERification Over Unstructured and Structured information (FEVEROUS) shared task asks participating systems to determine whether human-authored claims are SUPPORTED or REFUTED based on evidence retrieved from Wikipedia (or NOTENOUGHINFO if the claim cannot be verified). Compared to the FEVER 2018 shared task, the main challenge is the addition of structured data (tables and lists) as a source of evidence. The claims in the FEVEROUS dataset can be verified using only structured evidence, only unstructured evidence, or a mixture of both. Submissions are evaluated using the FEVEROUS score, which combines label accuracy and evidence retrieval. Unlike FEVER 2018 (Thorne et al., 2018a), FEVEROUS requires partial evidence to be returned for NOTENOUGHINFO claims, and the claims are longer and thus more complex. The shared task received 13 entries, six of which were able to beat the baseline system. The winning team was “Bust a move!”, achieving a FEVEROUS score of 27% (+9% compared to the baseline). In this paper we describe the shared task, present the full results, and highlight commonalities and innovations among the participating systems.
AB - The Fact Extraction and VERification Over Unstructured and Structured information (FEVEROUS) shared task asks participating systems to determine whether human-authored claims are SUPPORTED or REFUTED based on evidence retrieved from Wikipedia (or NOTENOUGHINFO if the claim cannot be verified). Compared to the FEVER 2018 shared task, the main challenge is the addition of structured data (tables and lists) as a source of evidence. The claims in the FEVEROUS dataset can be verified using only structured evidence, only unstructured evidence, or a mixture of both. Submissions are evaluated using the FEVEROUS score, which combines label accuracy and evidence retrieval. Unlike FEVER 2018 (Thorne et al., 2018a), FEVEROUS requires partial evidence to be returned for NOTENOUGHINFO claims, and the claims are longer and thus more complex. The shared task received 13 entries, six of which were able to beat the baseline system. The winning team was “Bust a move!”, achieving a FEVEROUS score of 27% (+9% compared to the baseline). In this paper we describe the shared task, present the full results, and highlight commonalities and innovations among the participating systems.
UR - http://www.scopus.com/inward/record.url?scp=85123469969&partnerID=8YFLogxK
M3 - Conference paper
AN - SCOPUS:85123469969
T3 - FEVER 2021 - Fact Extraction and VERification, Proceedings of the 4th Workshop
SP - 1
EP - 13
BT - FEVER 2021 - Fact Extraction and VERification, Proceedings of the 4th Workshop
A2 - Aly, Rami
A2 - Christodoulopoulos, Christos
A2 - Cocarascu, Oana
A2 - Guo, Zhijiang
A2 - Mittal, Arpit
A2 - Schlichtkrull, Michael
A2 - Thorne, James
A2 - Vlachos, Andreas
PB - Association for Computational Linguistics (ACL)
T2 - 4th Workshop on Fact Extraction and VERification, FEVER 2021
Y2 - 10 November 2021
ER -