From Detection to Explanation: Effective Learning Strategies for LLMs in Online Abusive Language Research

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


Abstract

Abusive language detection relies on understanding different levels of intensity, expressiveness and targeted groups, which requires commonsense reasoning, world knowledge and linguistic nuances that evolve over time. Here, we frame the problem as a knowledge-guided learning task, and demonstrate that LLMs' implicit knowledge, without an accurate strategy, is suitable for neither multi-class detection nor explanation generation. We publicly release GLlama Alarm, the knowledge-Guided version of Llama-2, instruction fine-tuned for multi-class abusive language detection and explanation generation. Fine-tuned on structured explanations and reliable external knowledge sources, our model mitigates bias and generates explanations that are relevant to the text and coherent with human reasoning, achieving on average 48.76% better alignment with human judgment according to our expert survey.
Original language: English
Title of host publication: Proceedings of the 2025 International Conference on Computational Linguistics (COLING 2025)
Place of Publication: Abu Dhabi
Publication status: Published - Jan 2025

Keywords

  • hate speech
  • Large Language Models
  • text classification
  • text generation

