FCE-SVM: a new cluster based ensemble method for opinion mining from social media

Wang, Gang; Zheng, Daqing; Yang, Shanlin; Ma, Jian

doi:10.1007/s10257-017-0352-0

Original Article
Published: 18 July 2017

Volume 16, pages 721–742, (2018)

Information Systems and e-Business Management

Gang Wang^1,2,3, Daqing Zheng^4,5, Shanlin Yang^1,2 & Jian Ma^3


Abstract

Opinion mining, which aims to automatically detect subjective information, has attracted growing interest from both academia and industry in recent years. To enhance the performance of opinion mining, several ensemble methods have been investigated and shown to be effective both theoretically and empirically. However, cluster-based ensemble methods have received less attention in the area of opinion mining. In this paper, a new cluster-based ensemble method, FCE-SVM, is proposed for opinion mining from social media. Following the divide-and-conquer philosophy, FCE-SVM uses a fuzzy clustering module to generate different training sub-datasets in the first stage. Base learners are then trained on these different training datasets in the second stage. Finally, a fusion module is employed to combine the results of the base learners. Multi-domain opinion datasets were used to verify the effectiveness of the proposed method. Empirical results reveal that FCE-SVM achieves the best performance by reducing bias and variance simultaneously. These results illustrate that FCE-SVM can be used as a viable method for opinion mining.
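The three stages named in the abstract (fuzzy clustering, per-cluster base learners, fusion) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the clustering parameters, the rule for building sub-datasets, or the fusion rule, so the membership threshold and the membership-weighted vote below are assumptions. A nearest-centroid classifier stands in for the SVM base learner to keep the example dependency-free, and all function names and the toy 2-D data are hypothetical.

```python
import math

def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of point x in each cluster (fuzzifier m > 1)."""
    dists = [math.dist(x, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # A point lying exactly on a center belongs fully to that cluster.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def fuzzy_c_means(X, k, m=2.0, iters=50):
    """Plain fuzzy c-means; returns the k cluster centers."""
    # Assumption: deterministic initialisation from k evenly spaced samples.
    centers = [list(X[i * len(X) // k]) for i in range(k)]
    for _ in range(iters):
        U = [fuzzy_memberships(x, centers, m) for x in X]
        for j in range(k):
            w = [u[j] ** m for u in U]
            centers[j] = [sum(wi * x[d] for wi, x in zip(w, X)) / sum(w)
                          for d in range(len(X[0]))]
    return centers

def train_nearest_centroid(X, y):
    """Stand-in base learner (NOT an SVM): one mean vector per class."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in groups.items()}

def predict_nearest_centroid(model, x):
    return min(model, key=lambda label: math.dist(x, model[label]))

def fit_fce(X, y, k=2, m=2.0, threshold=0.3):
    """Stage 1: fuzzy clustering into sub-datasets. Stage 2: one learner each."""
    centers = fuzzy_c_means(X, k, m)
    models = []
    for j in range(k):
        # Assumption: a sample joins sub-dataset j if its membership >= threshold.
        idx = [i for i, x in enumerate(X)
               if fuzzy_memberships(x, centers, m)[j] >= threshold]
        models.append(train_nearest_centroid([X[i] for i in idx],
                                             [y[i] for i in idx]))
    return centers, models

def predict_fce(ensemble, x, m=2.0):
    """Stage 3 (assumed fusion rule): membership-weighted vote of base learners."""
    centers, models = ensemble
    votes = {}
    for u, model in zip(fuzzy_memberships(x, centers, m), models):
        if model:  # skip clusters that received no training samples
            label = predict_nearest_centroid(model, x)
            votes[label] = votes.get(label, 0.0) + u
    return max(votes, key=votes.get)

# Toy data: two "domains" whose label depends on the second feature
# in opposite directions, so a single global centroid model cannot separate them.
X = [(0, 0), (1, 0), (0, 2), (1, 2), (10, 0), (11, 0), (10, 2), (11, 2)]
y = ["neg", "neg", "pos", "pos", "pos", "pos", "neg", "neg"]

ensemble = fit_fce(X, y, k=2)
print(predict_fce(ensemble, (0.5, 1.8)))
print(predict_fce(ensemble, (10.5, 1.8)))
```

On this toy data each base learner only sees the samples of its own cluster, and the fused vote is dominated by the learner whose cluster the query point belongs to. Replacing `train_nearest_centroid` with an SVM trainer would bring the sketch closer to the method's name, but the exact configuration used in the paper is not stated in the abstract.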



Acknowledgements

This work is partially supported by the National Natural Science Foundation of China (Nos. 71101042, 71471054), the National Program on Key Basic Research Project (973 Program) (No. 2013CB329603), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20110111120014), and the China Postdoctoral Science Foundation (Nos. 2011M501041, 2013T60611).

Author information

Authors and Affiliations

School of Management, Hefei University of Technology, Hefei, 230009, Anhui, People’s Republic of China
Gang Wang & Shanlin Yang
Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei, Anhui, People’s Republic of China
Gang Wang & Shanlin Yang
Department of Information Systems, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Gang Wang & Jian Ma
Room 208, School of Information Management and Engineering, SUFE, No. 100, Wudong Road, Shanghai, 200433, People’s Republic of China
Daqing Zheng
Shanghai Key Laboratory of Financial Information Technology, SUFE, Shanghai, 200433, People’s Republic of China
Daqing Zheng

Corresponding author

Correspondence to Daqing Zheng.

About this article

Cite this article

Wang, G., Zheng, D., Yang, S. et al. FCE-SVM: a new cluster based ensemble method for opinion mining from social media. Inf Syst E-Bus Manage 16, 721–742 (2018). https://doi.org/10.1007/s10257-017-0352-0

Received: 19 January 2015
Revised: 24 June 2015
Accepted: 15 July 2015
Published: 18 July 2017
Issue Date: November 2018
DOI: https://doi.org/10.1007/s10257-017-0352-0
