
A big data approach to sentiment analysis using greedy feature selection with cat swarm optimization-based long short-term memory neural networks

The Journal of Supercomputing

Abstract

Sentiment analysis is crucial in systems such as opinion mining and prediction. Considerable research has applied various machine learning techniques to sentiment analysis, but the high error rates reported in these studies can reduce the efficiency of the entire system. To overcome this problem, we introduce a novel big data and machine learning technique for sentiment analysis. The data are collected from a large volume of datasets, which supports effective analysis. Noise in the data is eliminated using data mining preprocessing. From the cleaned sentiment data, a greedy approach selects the optimal features, which are then processed by a classifier called the cat swarm optimization-based long short-term memory neural network (CSO-LSTMNN). The classifier analyzes sentiment-related features according to cat behavior, minimizing the error rate while examining features. This technique helps improve system efficiency, which is evaluated experimentally in terms of error rate, precision, recall, and accuracy. The results of the greedy feature selection and CSO-LSTMNN algorithm are compared with those of the particle swarm optimization (PSO) algorithm; CSO-LSTMNN outperforms PSO, achieving higher accuracy and a lower error rate.
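The abstract does not spell out the greedy selection criterion, so the following is only a minimal sketch of generic greedy forward feature selection, not the authors' implementation. The function names (`greedy_forward_selection`, `centroid_accuracy`), the toy data, and the nearest-centroid scorer are all assumptions made for illustration; in the paper's pipeline the scoring role would be played by the CSO-LSTMNN classifier.

```python
from typing import Callable, List, Sequence


def greedy_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> List[int]:
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Score each candidate when added to the current subset.
        scored = [(score(selected + [f]), f) for f in candidates]
        # Highest score wins; ties go to the lowest feature index.
        cand_score, cand = max(scored, key=lambda t: (t[0], -t[1]))
        if cand_score <= best_score:
            break  # no candidate improves the score
        selected.append(cand)
        best_score = cand_score
    return selected


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier on the chosen columns.

    Hypothetical stand-in scorer, used only to keep the sketch self-contained.
    """
    labels = sorted(set(y))
    centroids = {}
    for label in labels:
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, t in zip(X, y):
        def dist(label):
            return sum((x[f] - c) ** 2 for f, c in zip(features, centroids[label]))
        correct += min(labels, key=dist) == t
    return correct / len(y)


# Toy data (made up): column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [0.9, 0.5, 0.1],
    [0.8, 0.1, 0.9],
    [0.7, 0.9, 0.4],
    [0.1, 0.4, 0.2],
    [0.2, 0.8, 0.8],
    [0.3, 0.2, 0.5],
]
y = [1, 1, 1, 0, 0, 0]

selected = greedy_forward_selection(
    n_features=3,
    score=lambda feats: centroid_accuracy(X, y, feats),
    max_features=2,
)
print(selected)
```

On this toy data the procedure keeps only column 0, because adding either noise column does not raise the score. A greedy search of this kind evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing the globally best subset.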



Acknowledgements

This research was supported by King Saud University, Deanship of Scientific Research, Community College Research Unit.

Author information

Corresponding author

Correspondence to Amr Tolba.


Cite this article

Alarifi, A., Tolba, A., Al-Makhadmeh, Z. et al. A big data approach to sentiment analysis using greedy feature selection with cat swarm optimization-based long short-term memory neural networks. J Supercomput 76, 4414–4429 (2020). https://doi.org/10.1007/s11227-018-2398-2


  • DOI: https://doi.org/10.1007/s11227-018-2398-2
