A hybrid model for opinion mining based on domain sentiment dictionary

Cai, Yi; Yang, Kai; Huang, Dongping; Zhou, Zikai; Lei, Xue; Xie, Haoran; Wong, Tak-Lam

doi:10.1007/s13042-017-0757-6

Original Article
Published: 12 December 2017

Volume 10, pages 2131–2142, (2019)

International Journal of Machine Learning and Cybernetics

Yi Cai¹,
Kai Yang²,
Dongping Huang¹,
Zikai Zhou¹,
Xue Lei¹,
Haoran Xie³ &
Tak-Lam Wong⁴

934 Accesses
37 Citations

Abstract

Sentiment classification is an application of sentiment analysis, a popular research field in NLP, and classifies documents into categories according to their sentiment. A sentiment classification task first extracts sentiment features from documents and then classifies the documents with a classifier. In the first step, the traditional way to extract sentiment features is to apply sentiment dictionaries. However, a sentiment word may have different sentiment tendencies in different contexts, and traditional sentiment dictionaries do not account for this, so the wrong sentiment tendency may be selected for a sentiment word. In our research, we find that a sentiment word no longer has diverse meanings once it is associated with the nearby aspects and entities in a document. We therefore propose a three-layer sentiment dictionary, which associates sentiment words with their corresponding entities and aspects to reduce their multiple meanings. In the second step, many classification models, such as SVM and GBDT, can be used to classify documents according to the extracted sentiment words. However, different classifiers have different weaknesses. We apply a Stacking-based hybrid model that combines SVM and GBDT to overcome their individual weaknesses and reach higher performance. This hybrid model contains two layers, and the output of the first layer becomes the input of the second. The first layer generates different classification results from the different classifiers, while the second layer automatically learns which of them to select as the final result. The experimental results show that our hybrid model outperforms the baseline single models.
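
The three-layer dictionary idea described in the abstract can be illustrated with a small lookup structure. The following is only a minimal sketch, assuming the three layers are entity, aspect, and sentiment word, stored as a nested mapping. The names `DOMAIN_DICT`, `GENERAL_DICT`, and `polarity`, and every dictionary entry, are hypothetical and chosen for illustration; they are not taken from the article.

```python
from typing import Dict, Optional

# Hypothetical three-layer dictionary: entity -> aspect -> sentiment word -> polarity.
# Every entry here is invented for illustration and is not from the article.
DOMAIN_DICT: Dict[str, Dict[str, Dict[str, int]]] = {
    "phone": {
        "battery life": {"long": 1, "short": -1},
        "boot time": {"long": -1, "short": 1},
    },
}

# Hypothetical flat (context-free) dictionary used as a fallback.
# "long" is ambiguous without context, so it is neutral here.
GENERAL_DICT: Dict[str, int] = {"good": 1, "bad": -1, "long": 0}


def polarity(word: str,
             entity: Optional[str] = None,
             aspect: Optional[str] = None) -> int:
    """Return +1 (positive), -1 (negative), or 0 (neutral or unknown).

    The context-specific entry for (entity, aspect, word) is preferred;
    if there is none, the flat dictionary is consulted instead.
    """
    by_aspect = DOMAIN_DICT.get(entity, {}).get(aspect, {})
    if word in by_aspect:
        return by_aspect[word]
    return GENERAL_DICT.get(word, 0)


if __name__ == "__main__":
    # The same word resolves differently depending on the aspect.
    print(polarity("long", "phone", "battery life"))
    print(polarity("long", "phone", "boot time"))
    print(polarity("long"))
```

In this sketch the word "long" is positive for one aspect and negative for another, while the flat dictionary alone cannot tell the two cases apart. How the article actually builds and stores its dictionary is not stated in the abstract.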

References

Cambria E (2016) Affective computing and sentiment analysis. IEEE Intell Syst 31(2):102–107
Cavnar WB, Trenkle JM et al (1994) N-gram-based text categorization. Ann Arbor MI 48113(2):161–175
Dong Z, Dong Q (2006) Hownet and the computation of meaning. World Scientific, Singapore
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378
Fu Z, Huang F, Sun X, Vasilakos A, Yang C-N (2016) Enabling semantic search based on conceptual graphs over encrypted outsourced data. IEEE Trans Serv Comput PP:1–1
Fu Z, Ren K, Shu J, Sun X, Huang F (2016) Enabling personalized search over encrypted outsourced data with efficiency improvement. IEEE Trans Parallel Distrib Syst 27(9):2546–2559
Fu Z, Wu X, Guan C, Sun X, Ren K (2016) Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement. IEEE Trans Inf Forensics Secur 11(12):2706–2716
Goldberg Y, Levy O (2014) word2vec explained: deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722
Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’99. ACM, New York, NY, USA, pp 50–57
Ko Y (2012) A study of term weighting schemes using class information for text classification. In: Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval, ACM, pp 1029–1030
Lan M, Tan CL, Su J, Lu Y (2009) Supervised and traditional term weighting methods for automatic text categorization. Pattern Anal Mach Intell IEEE Trans 31(4):721–735
Leopold E, Kindermann J (2002) Text categorization with support vector machines. How to represent texts in input space? Mach Learn 46(1–3):423–444
Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167
Liu B, Hu M, Cheng J (2005) Opinion observer: analyzing and comparing opinions on the web. In: Proceedings of the 14th international conference on world wide web, WWW ’05. ACM, New York, NY, USA, pp 342–351
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: NIPS’13 Proceedings of the 26th international conference on neural information processing systems, vol 2, 5–10 Dec 2013, Lake Tahoe, Nevada, pp 3111–3119
Paik JH (2013) A novel tf-idf weighting scheme for effective ranking. In: Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval, SIGIR ’13. ACM, New York, NY, USA, pp 343–352
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–2):1–135
Papadimitriou CH, Tamaki H, Raghavan P, Vempala S (1998) Latent semantic indexing: a probabilistic analysis. In: Proceedings of the seventeenth ACM SIGACT–SIGMOD–SIGART symposium on principles of database systems, ACM, pp 159–168
Quan X, Wenyin L, Qiu B (2011) Term weighting schemes for question categorization. Pattern Anal Mach Intell IEEE Trans 33(5):1009–1021
Rabiner L, Juang B (1986) An introduction to hidden Markov models. IEEE ASSP Mag 3(1):4–16
Sparck Jones K (1972) A statistical interpretation of term specificity and its application in retrieval. J Doc 28(1):11–21
Wang G, Hao J, Ma J, Jiang H (2011) A comparative assessment of ensemble learning for credit scoring. Expert Syst Appl 38(1):223–230
Wang T, Cai Y, Leung H, Cai Z, Min H (2015) Entropy-based term weighting schemes for text categorization in VSM. In: Tools with artificial intelligence (ICTAI), 2015 IEEE 27th international conference. IEEE, Vietri sul Mare, Italy, pp 325–332
Xia Z, Wang X, Sun X, Wang Q (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans Parallel Distrib Syst 27(2):340–352
Xue B, Fu C, Shaobin Z (2014) A study on sentiment computing and classification of sina weibo with word2vec. In: Big Data (BigData Congress), 2014 IEEE international congress. IEEE, Anchorage, AK, USA, pp 358–363
Yang K, Cai Y, Huang D, Li J, Zhou Z, Lei X (2017) An effective hybrid model for opinion mining and sentiment analysis. In: Big data and smart computing (BigComp), 2017 IEEE international conference. IEEE, Jeju, South Korea, pp 465–466

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities, SCUT (No. 2017ZD048), Tiptop Scientific and Technical Innovative Youth Talents of Guangdong special support program (No. 2015TQ01X633), Science and Technology Planning Project of Guangdong Province, China (No. 2017B050506004), Science and Technology Program of Guangzhou (International Science & Technology Cooperation Program No. 201704030076), and the Internal Research Grant (RG 66/2016-2017) and the Funding Support to ECS Proposal (RG 23/2017-2018R) of The Education University of Hong Kong.

Author information

Authors and Affiliations

South China University of Technology, Guangzhou, China
Yi Cai, Dongping Huang, Zikai Zhou & Xue Lei
City University of Hong Kong, Hong Kong, Hong Kong
Kai Yang
The Education University of Hong Kong, Hong Kong, Hong Kong
Haoran Xie
Douglas College, New Westminster, Canada
Tak-Lam Wong

Corresponding author

Correspondence to Yi Cai.

Additional information

A preliminary version of this article was published in ASC 2017, in conjunction with BigComp 2017 [27].

About this article

Cite this article

Cai, Y., Yang, K., Huang, D. et al. A hybrid model for opinion mining based on domain sentiment dictionary. Int. J. Mach. Learn. & Cyber. 10, 2131–2142 (2019). https://doi.org/10.1007/s13042-017-0757-6

Received: 01 April 2017
Accepted: 01 December 2017
Published: 12 December 2017
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s13042-017-0757-6
