TY - JOUR
T1 - Hierarchical Interpretation of Neural Text Classification
AU - Yan, Hanqi
AU - Gui, Lin
AU - He, Yulan
N1 - Funding Information:
This work was funded by the UK Engineering and Physical Sciences Research Council (grant no. EP/T017112/1, EP/V048597/1, EP/X019063/1). Hanqi Yan receives the PhD scholarship funded jointly by the University of Warwick and the Chinese Scholarship Council. Yulan He is supported by a Turing AI Fellowship funded by the UK Research and Innovation (grant no. EP/V020579/1).
Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022/2/20
Y1 - 2022/2/20
N2 - Recent years have witnessed increasing interest in developing interpretable models in Natural Language Processing (NLP). Most existing models aim at identifying input features such as words or phrases important for model predictions. Neural models developed in NLP, however, often compose word semantics in a hierarchical manner. As such, interpretation by words or phrases only cannot faithfully explain model decisions in text classification. This article proposes a novel Hierarchical Interpretable Neural Text classifier, called HINT, which can automatically generate explanations of model predictions in the form of label-associated topics in a hierarchical manner. Model interpretation is no longer at the word level, but built on topics as the basic semantic unit. Experimental results on both review datasets and news datasets show that our proposed approach achieves text classification results on par with existing state-of-the-art text classifiers, and generates interpretations more faithful to model predictions and better understood by humans than other interpretable neural text classifiers.
KW - cs.CL
KW - cs.AI
KW - cs.IR
UR - http://www.scopus.com/inward/record.url?scp=85143255001&partnerID=8YFLogxK
DO - 10.1162/coli_a_00459
M3 - Article
VL - 48
T2 - Computational Linguistics
SP - 987
EP - 1020
PB - MIT Press
CY - Cambridge, MA
ER -