Sentiment analysis of Chinese online reviews using ensemble learning framework

Cluster Computing

Abstract

With the development of e-commerce, the volume of unstructured online reviews is growing rapidly, and these reviews contain sentiment information of great interest to both consumers and businesses. Effective sentiment classification has therefore become an important research topic. Many studies have shown that ensemble learning methods are promising for sentiment classification tasks. In this paper, we propose a new ensemble learning framework for sentiment classification of Chinese online reviews. First, to account for the complicated characteristics of Chinese online reviews, we extract three kinds of input features: the Part of Speech Combination Pattern, the Frequent Word Sequence Pattern, and the Order Preserved Submatrix Pattern. Second, to address the very large number of features in the reviews, we use the Random Subspace based on Information Gain algorithm, which can also improve the base classifiers. Finally, we adopt the Constructing Base Classifiers based on Product Attributes algorithm, which combines the sentiment information of each product attribute in a review to obtain better sentiment classification performance. The experimental results show that the proposed ensemble learning framework significantly improves sentiment classification of Chinese online reviews.
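The abstract does not spell out how the Random Subspace based on Information Gain step works, so the following is only a minimal sketch of one plausible reading: features are sampled into subspaces with probability proportional to their information gain, so that each base classifier would see a different but informative feature subset. The function names (`entropy`, `information_gain`, `ig_weighted_subspaces`), the weighted sampling scheme, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column w.r.t. the labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def ig_weighted_subspaces(X, y, n_subspaces, subspace_size, seed=0):
    """Draw feature-index subsets, favouring features with higher gain.

    Hypothetical scheme: within each subspace, features are drawn without
    replacement, with probability proportional to (gain + eps).
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y)
             for j in range(n_features)]
    eps = 1e-9  # keeps zero-gain features selectable with tiny probability
    subspaces = []
    for _ in range(n_subspaces):
        remaining = list(range(n_features))
        chosen = []
        for _ in range(min(subspace_size, n_features)):
            weights = [gains[j] + eps for j in remaining]
            pick = rng.choices(remaining, weights=weights, k=1)[0]
            chosen.append(pick)
            remaining.remove(pick)
        subspaces.append(sorted(chosen))
    return gains, subspaces


# Toy binary pattern features (rows are reviews) and sentiment labels.
X = [[1, 0, 1], [1, 0, 0], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 0, 0]
gains, subspaces = ig_weighted_subspaces(X, y, n_subspaces=5, subspace_size=2, seed=1)
print(gains)
print(subspaces)
```

In a full ensemble, one base classifier would be trained on each returned index subset and the predictions combined, for example by majority vote. How the paper actually weights or selects features may differ from this sketch.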


Acknowledgements

The authors gratefully thank the colleagues who participated in this work and provided technical support. This work is supported by a grant from the National Natural Science Foundation of China (No. 61672126), the Guangdong Provincial Engineering Technology Research Center for Data Science (Nos. 2016KF09, 2016KF10), and the National Statistical Science Research Project of China (No. 2016LY98). This work was also supported by the Science and Technology Department of Guangdong Province in China (Grant Nos. 2016A010101020, 2016A010101021, 2016A010101022), the Foundation of Guangdong Polytechnic of Science and Technology (No. XJSC2016206), the Natural Science Funds of Shenzhen Science and Technology Innovation Commission (No. JCYJ20160527172144272), and the Innovation Project of Graduate School of South China Normal University (No. 2015lkxm37). The authors also thank the scholars who shared the datasets used in this work.

Author information

Corresponding author

Correspondence to Yun Xue.

About this article

Cite this article

Huang, J., Xue, Y., Hu, X. et al. Sentiment analysis of Chinese online reviews using ensemble learning framework. Cluster Comput 22 (Suppl 2), 3043–3058 (2019). https://doi.org/10.1007/s10586-018-1858-z

  • DOI: https://doi.org/10.1007/s10586-018-1858-z
