The Impact of Data Augmentation on Sentiment Analysis of Translated Textual Data

Thuraya Omran, Baraa Sharef, Crina Grosan, Yongmin Li

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

2 Citations (Scopus)

Abstract

Sentiment analysis is an application of natural language processing that requires abundant data, which is not always available. Data augmentation addresses this shortage by creating synthetic training examples from the existing data, without collecting new data. It boosts model performance, especially for deep learning models. Despite this role, it has attracted very little attention from the Arabic NLP community, even though Arabic and its dialects have scarce language resources. In this study, an augmentation technique called random swap was applied with an LSTM deep learning model to classify three parallel datasets: Bahraini dialects, Modern Standard Arabic, and English. The results show that applying the augmentation technique improved the LSTM model by 14.06%, 12.57%, and 11.04% on the Bahraini dialects, Modern Standard Arabic, and English datasets, respectively, compared with no augmentation.
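
The random swap technique named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no parameters, so the function names (`random_swap`, `augment`), the whitespace tokenization, and the defaults for `n_swaps` and `n_aug` are assumptions chosen for illustration only.

```python
import random


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of `tokens` with `n_swaps` random pairs of positions swapped.

    Hypothetical helper: the number of swaps is an assumption, not a value
    reported in the abstract.
    """
    rng = rng or random.Random()
    out = list(tokens)
    if len(out) < 2:
        # Nothing to swap in an empty or single-token sentence.
        return out
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange their tokens.
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(sentence, n_aug=2, n_swaps=1, seed=0):
    """Create `n_aug` synthetic variants of `sentence` by random swap.

    Whitespace tokenization is assumed here for simplicity; a real pipeline
    for Arabic text would likely use a language-aware tokenizer.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    return [" ".join(random_swap(tokens, n_swaps, rng)) for _ in range(n_aug)]


if __name__ == "__main__":
    for variant in augment("the service was really good today"):
        print(variant)
```

Each variant keeps exactly the same words as the original sentence, so it can reasonably inherit the original sentiment label. In a setup like the one described, such variants would typically be added to the training split only, leaving the evaluation data untouched.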

Original language: English
Title of host publication: 2023 International Conference on IT Innovation and Knowledge Discovery, ITIKD 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665463720
Publication status: Published - 2023
Event: 2023 International Conference on IT Innovation and Knowledge Discovery, ITIKD 2023 - Manama, Bahrain
Duration: 8 Mar 2023 to 9 Mar 2023

Publication series

Name: 2023 International Conference on IT Innovation and Knowledge Discovery, ITIKD 2023

Keywords

  • Bahraini dialects
  • Data augmentation
  • LSTM
  • Modern standard Arabic
  • translation-based
