Uncertainty Quantification for Text Classification

Dell Zhang, Murat Sensoy, Masoud Makrehchi, Bilyana Taneva-Popova, Lin Gui, Yulan He

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

4 Citations (Scopus)

Abstract

This full-day tutorial introduces modern techniques for practical uncertainty quantification, specifically in the context of multi-class and multi-label text classification. First, we explain the usefulness of estimating aleatoric uncertainty and epistemic uncertainty for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data: Virtual Ensemble in GBDT, Bayesian Deep Learning (including Deep Ensemble, Monte-Carlo Dropout, Bayes by Backprop, and their generalization Epistemic Neural Networks), Evidential Deep Learning (including Prior Networks and Posterior Networks), as well as Distance Awareness (including Spectral-normalized Neural Gaussian Process and Deep Deterministic Uncertainty). Next, we cover the latest advances in uncertainty quantification for pre-trained language models, including asking language models to express their uncertainty, interpreting uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning. After that, we discuss typical application scenarios of uncertainty quantification in text classification, including in-domain calibration, cross-domain robustness, and novel class detection. Finally, we list popular performance metrics for evaluating the effectiveness of uncertainty quantification in text classification. Practical hands-on examples and exercises are provided so that attendees can experiment with different uncertainty quantification methods on real-world text classification datasets such as CLINC150.
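To give a flavour of one of the methods named above, the following is a minimal, illustrative sketch (not taken from the tutorial materials) of Monte-Carlo Dropout for a toy text classifier in PyTorch, with the predictive entropy decomposed into aleatoric and epistemic parts. All names here (TextClassifier, mc_dropout_predict, the hyperparameter values, and the 150-class output nodding to CLINC150's intent set) are hypothetical choices for the example, not part of the published tutorial.

```python
# Illustrative Monte-Carlo Dropout sketch for text classification (hypothetical code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextClassifier(nn.Module):
    """Toy bag-of-embeddings classifier with dropout, used only for illustration."""
    def __init__(self, vocab_size=10000, embed_dim=128, num_classes=150, p_drop=0.3):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)  # mean-pooled embeddings
        self.dropout = nn.Dropout(p_drop)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids, offsets):
        h = self.dropout(self.embedding(token_ids, offsets))
        return self.fc(h)  # class logits

def mc_dropout_predict(model, token_ids, offsets, n_samples=20):
    """Keep dropout active at test time and average softmax outputs over stochastic passes."""
    model.train()  # enables dropout; in practice, keep any batch-norm layers in eval mode
    with torch.no_grad():
        probs = torch.stack([
            F.softmax(model(token_ids, offsets), dim=-1) for _ in range(n_samples)
        ])                                    # shape: (n_samples, batch, num_classes)
    mean_probs = probs.mean(dim=0)            # predictive distribution
    total = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)    # total uncertainty (entropy)
    aleatoric = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)  # expected entropy
    epistemic = total - aleatoric             # mutual information (epistemic part)
    return mean_probs, total, aleatoric, epistemic

# Example usage with random token ids standing in for two encoded documents.
model = TextClassifier()
token_ids = torch.randint(0, 10000, (12,))
offsets = torch.tensor([0, 5])               # documents of lengths 5 and 7
mean_probs, total, aleatoric, epistemic = mc_dropout_predict(model, token_ids, offsets)
print(epistemic)                             # higher values suggest inputs the model has not learned about
```

The entropy/mutual-information decomposition used above is one common way to separate aleatoric from epistemic uncertainty; the tutorial's other approaches (ensembles, evidential networks, distance-aware models) estimate these quantities differently.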

Original language: English
Title of host publication: SIGIR '23
Subtitle of host publication: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: ACM
Pages: 3426-3429
Number of pages: 4
ISBN (Print): 9781450394086
DOIs
Publication status: Published - 18 Jul 2023
Event: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023 - Taipei, Taiwan
Duration: 23 Jul 2023 - 27 Jul 2023

Conference

Conference: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023
Country/Territory: Taiwan
City: Taipei
Period: 23/07/2023 - 27/07/2023

Keywords

  • language models
  • text classification
  • uncertainty quantification
