Machine learning-based new approach to films review

Jassim, Mustafa Abdalrassual; Abd, Dhafar Hamed; Omri, Mohamed Nazih

doi:10.1007/s13278-023-01042-7

Original Article
Published: 02 March 2023

Volume 13, article number 40, (2023)
Social Network Analysis and Mining


Abstract

The main purpose of Sentiment Analysis (SA) is to derive useful insights from large amounts of unstructured data compiled from various sources. SA interprets and classifies textual data using machine learning (ML) models. In this paper, we compare simple and ensemble ML methods as classifiers for SA: Random Forest, K-Nearest Neighbor, Artificial Neural Network, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Extreme Gradient Boosting, Decision Tree, LightGBM, Stochastic Gradient Descent and Bagging. We use a dataset of 50,000 movie reviews, of which 25,000 are rated positive and 25,000 negative, and we select 20,000 words that influence the sentiment of the documents. This work aims to propose a new rating prediction approach based on textual customer reviews. We use term frequency and term frequency-inverse document frequency features, obtained from large-scale and serial trials, to compare the results of the various classifiers under each feature extraction technique. For the decision phase, we apply the Fuzzy Decision by Opinion Score Method (FDOSM), one of the most recent methods for multi-criteria decision-making. To evaluate and quantify the performance of the ML methods, we apply six standard measures: precision, accuracy, recall, F-score, AUC, and Kappa-measure. The experimental results indicate that the SVM classifier is the best, with a precision of 88.333%, followed by the FDOSM method, with 0.800 for the same measure.
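
The six measures named in the abstract can be illustrated with a minimal sketch for the binary positive/negative setting. This is not the authors' implementation: the function names (`confusion_counts`, `auc_from_scores`, `evaluate`) are hypothetical, labels are assumed to be encoded as 1 (positive) and 0 (negative), "Kappa-measure" is assumed to mean Cohen's kappa, and AUC is computed from classifier scores by pairwise ranking.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = positive review."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def auc_from_scores(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_pred, scores):
    """Compute precision, accuracy, recall, F-score, AUC and Cohen's kappa."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    # Chance agreement: probability that truth and prediction match
    # if they were independent with the observed class frequencies.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {
        "precision": precision,
        "accuracy": accuracy,
        "recall": recall,
        "f_score": f_score,
        "auc": auc_from_scores(y_true, scores),
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    for name, value in evaluate(y_true, y_pred, scores).items():
        print(f"{name}: {value:.4f}")
```

On a balanced dataset such as the one described, accuracy and kappa carry similar information, since chance agreement is close to 0.5; kappa becomes more informative when the classes are imbalanced.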

Notes

www.kaggle.com.

Author information

Authors and Affiliations

MARS Research Laboratory, University of Sousse, Sousse, Tunisia
Mustafa Abdalrassual Jassim & Mohamed Nazih Omri
Monastir Faculty of Science, University of Monastir, Monastir, Tunisia
Mustafa Abdalrassual Jassim
Al-Muthanna University, Samawah, Iraq
Mustafa Abdalrassual Jassim
Department of Computer Science, Al-Maaref University College, Alanbar, Iraq
Dhafar Hamed Abd

Contributions

MAA, DHA, and MNO contributed to the study’s conception and design. The first draft of the manuscript was written by all the authors. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mustafa Abdalrassual Jassim.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jassim, M.A., Abd, D.H. & Omri, M.N. Machine learning-based new approach to films review. Soc. Netw. Anal. Min. 13, 40 (2023). https://doi.org/10.1007/s13278-023-01042-7

Received: 16 January 2023
Revised: 08 February 2023
Accepted: 17 February 2023
Published: 02 March 2023
DOI: https://doi.org/10.1007/s13278-023-01042-7
