DOI: 10.1145/3555776.3577746
Research article

Arabic Aspect Category Detection for Hotel Reviews based on Data Augmentation and Classifier Chains

Published: 07 June 2023

Abstract

Recently, the amount of content generated on online hospitality platforms has grown exponentially and has changed how people live. Consumers often consult online reviews before choosing a hotel. These reviews provide firsthand information that is essential to improving the quality of hotel services. However, the sheer volume of review data and its unstructured nature make it difficult to analyze. Many researchers have therefore explored sentiment analysis in the hotel industry, paying particular attention to aspect-based sentiment analysis, which categorizes opinions by aspect and identifies the sentiment expressed toward each aspect. Studies on the Arabic language, however, remain limited compared to English. This paper explores aspect category detection, a sub-task of aspect-based sentiment analysis, on Arabic reviews. We rely on the SemEval-2016 Arabic dataset for hotel reviews. Because this dataset has an imbalanced class distribution, we propose a multi-label data augmentation approach for its minority classes. We then propose a preprocessing pipeline tailored to this Arabic review dataset. Our aspect category prediction approach is based on the classifier chains technique: unlike previous works that treat each label separately, we model the dependencies between labels. Our findings show that the proposed approach achieves an F1 score that outperforms the leading related approaches.
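
To make the classifier chains idea concrete, the following is a minimal sketch and not the authors' implementation. It trains one binary classifier per label, and each classifier also receives the earlier labels in the chain as extra features, which is how label dependencies are taken into account. The 1-nearest-neighbour base learner, the three-word vocabulary, and the label names (rooms, service, location) are assumptions made purely for illustration; the abstract does not specify the base classifier or feature representation used in the paper.

```python
from typing import List, Sequence

Vector = List[float]


class NearestNeighbour:
    """Tiny 1-NN binary classifier, used only as a stand-in base learner."""

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "NearestNeighbour":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Vector) -> int:
        # Squared Euclidean distance to every stored training row.
        distances = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in self.X]
        return self.y[distances.index(min(distances))]


class ClassifierChain:
    """One binary classifier per label; classifier j also sees labels 0..j-1."""

    def __init__(self, n_labels: int) -> None:
        self.models = [NearestNeighbour() for _ in range(n_labels)]

    def fit(self, X: Sequence[Vector], Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Training: append the true values of the earlier labels.
            augmented = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(augmented, [y[j] for y in Y])
        return self

    def predict_one(self, x: Vector) -> List[int]:
        predicted: List[int] = []
        for model in self.models:
            # Prediction: earlier labels are the chain's own predictions.
            predicted.append(model.predict_one(list(x) + predicted))
        return predicted


# Hypothetical bag-of-words counts over the vocabulary ["clean", "staff", "far"].
X_train = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
# Hypothetical multi-label targets in the order [rooms, service, location].
Y_train = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

chain = ClassifierChain(n_labels=3).fit(X_train, Y_train)
print(chain.predict_one([1, 1, 0]))
```

A per-label baseline (often called binary relevance) would train the same three classifiers on the raw features only; the chain differs solely in the extra label columns appended during `fit` and `predict_one`.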


Published In

SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing
March 2023
1932 pages
ISBN:9781450395175
DOI:10.1145/3555776
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. aspect category
  2. multi-label classification
  3. imbalanced data
  4. preprocessing
  5. arabic hotel reviews
  6. classifier chains

Qualifiers

  • Research-article

Conference

SAC '23
Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
