research-article

Arabic Aspect Category Detection for Hotel Reviews based on Data Augmentation and Classifier Chains

Authors:

Sadok Ben Yahia

SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing

Pages 942 - 949

https://doi.org/10.1145/3555776.3577746

Published: 07 June 2023

Abstract

The amount of content generated on online hospitality platforms has grown rapidly in recent years and has changed how people plan their stays. Consumers often consult online reviews before choosing a hotel, and these reviews provide firsthand information that is essential for improving the quality of hotel services. However, the sheer volume of review data and its unstructured nature make it difficult to analyze. Many researchers have therefore explored sentiment analysis in the hotel industry. In particular, they have focused on aspect-based sentiment analysis, which groups opinions by aspect and identifies the sentiment expressed toward each aspect. Studies on Arabic, however, remain limited compared to those on English. This paper explores aspect category detection, a sub-task of aspect-based sentiment analysis, on Arabic reviews. We rely on the SemEval-2016 Arabic dataset of hotel reviews. Because this dataset has an imbalanced label distribution, we propose a multi-label data augmentation approach for its minority classes. We then propose a preprocessing pipeline tailored to these Arabic reviews. Our aspect category prediction approach is based on the classifier chains technique: unlike previous works that treat each label separately, we model the dependencies between labels. Our findings show that the proposed approach achieves an F₁ score that outperforms the pioneering related approaches.
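The classifier chains idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the perceptron base learner, the three keyword features, the label names (ROOMS, SERVICE, VALUE), and the toy data are all assumptions chosen to keep the example self-contained. Each link in the chain is a binary classifier that sees the original features plus the labels that come earlier in the chain, which is how dependencies between labels can be taken into account.

```python
from typing import List, Optional, Sequence


class Perceptron:
    """Tiny binary linear classifier, used here as a stand-in base learner (assumption)."""

    def __init__(self, epochs: int = 20) -> None:
        self.epochs = epochs
        self.w: List[float] = []
        self.b = 0.0

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "Perceptron":
        self.w = [0.0] * len(X[0])
        self.b = 0.0
        for _ in range(self.epochs):
            for xi, yi in zip(X, y):
                err = yi - self.predict_one(xi)  # -1, 0, or +1
                if err:
                    self.w = [w + err * v for w, v in zip(self.w, xi)]
                    self.b += err
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        score = sum(w * v for w, v in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0


class ClassifierChain:
    """One binary model per label; each model also sees the earlier labels in the chain."""

    def __init__(self, n_labels: int, order: Optional[Sequence[int]] = None) -> None:
        self.order = list(order) if order is not None else list(range(n_labels))
        self.models: List[Perceptron] = []

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        self.models = []
        for pos, label in enumerate(self.order):
            earlier = self.order[:pos]
            # Training uses the TRUE values of earlier labels as extra features.
            X_aug = [list(x) + [y[p] for p in earlier] for x, y in zip(X, Y)]
            target = [y[label] for y in Y]
            self.models.append(Perceptron().fit(X_aug, target))
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[List[int]]:
        results = []
        for x in X:
            feats = list(x)
            pred = {}
            for label, model in zip(self.order, self.models):
                # Prediction feeds each model's own output to the next link.
                pred[label] = model.predict_one(feats)
                feats.append(pred[label])
            results.append([pred[i] for i in range(len(self.order))])
        return results


# Hypothetical toy data: features flag the keywords [room, staff, price];
# labels are [ROOMS, SERVICE, VALUE]. None of this comes from SemEval-2016.
X = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 0, 0]]
Y = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 0, 0]]

chain = ClassifierChain(n_labels=3).fit(X, Y)
print(chain.predict(X))
```

The toy data is trivially separable, so it only demonstrates the mechanics of chaining. In practice the chain order and the base learner are design choices, and errors made early in the chain can propagate to later labels.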

Index Terms

Arabic Aspect Category Detection for Hotel Reviews based on Data Augmentation and Classifier Chains
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics
2. Networks
  1. Network properties
    1. Network reliability

Published In

SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing

March 2023

1932 pages

ISBN:9781450395175

DOI:10.1145/3555776

Conference Chairs:
Jiman Hong, Soongsil University, South Korea
Maart Lanperne, Tallinn University, Estonia
Program Chairs:
Juw Won Park, University of Louisville, USA
Tomas Cerny, Baylor University, USA
Publication Chair:
Hossain Shahriar, Kennesaw State University, USA

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SAC '23

Sponsor:

SIGAPP

SAC '23: 38th ACM/SIGAPP Symposium on Applied Computing

March 27 - 31, 2023

Tallinn, Estonia

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Article Metrics

Total Citations: 5
Total Downloads: 54
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 1

Reflects downloads up to 16 Feb 2025


Cited By

ElSabagh A, Azab S, Hefny H (2025). A comprehensive survey on Arabic text augmentation: approaches, challenges, and applications. Neural Computing and Applications. Online publication date: 7-Feb-2025.
https://doi.org/10.1007/s00521-025-11020-z
Youssef L, Elhoussaine Z (2024). Arabic Aspect Category Detection Using Traditional Neural Networks and Arbert. 2024 11th International Conference on Wireless Networks and Mobile Communications (WINCOM), 1-6. Online publication date: 23-Jul-2024.
https://doi.org/10.1109/WINCOM62286.2024.10657409
Német B, Rigó A, Sárdy M (2023). A bőrgyógyászati páciensek komplex pszichodermatológiai ellátásának szükségessége. Mentálhigiéné és Pszichoszomatika 24:4, 307-317. Online publication date: 26-Dec-2023.
https://doi.org/10.1556/0406.2023.00044
Ameur A, Hamdi S, Ben Yahia S (2023). Sentiment Analysis for Hotel Reviews: A Systematic Literature Review. ACM Computing Surveys 56:2, 1-38. Online publication date: 15-Sep-2023.
https://dl.acm.org/doi/10.1145/3605152
Ameur A, Hamdi S, Yahia S (2023). Enhanced approach of multilabel learning for the Arabic aspect category detection of the hotel reviews. Computational Intelligence 40:1. Online publication date: 14-Nov-2023.
https://doi.org/10.1111/coin.12609
