Abstract
Sentiment analysis is an important field of study in natural language processing. Achieving high classification accuracy on massive and irregular data is a major challenge in sentiment analysis. To address this problem, a novel maximum entropy-PLSA model is proposed. In this model, we first use probabilistic latent semantic analysis (PLSA) to extract seed emotion words from Wikipedia and the training corpus. Features are then extracted from these seed emotion words and used as the input for training the maximum entropy model. The test set is processed in the same way and passed to the trained maximum entropy model for emotional classification. The training set and the test set are divided by the K-fold method. The maximum entropy classification based on PLSA uses important emotional classification features to classify words, such as the relevance of words and parts of speech in the context, the relevance to degree adverbs, and the similarity to benchmark emotional words. Experiments show that the proposed classification method outperforms the compared methods.
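As a rough illustration of the classification stage described in the abstract, the following minimal sketch trains a maximum entropy classifier on binary features and evaluates it with a K-fold split. It is not the authors' implementation: the PLSA seed-word extraction step is omitted, and the toy data, the feature names (`near_good`, `degree_adv`, `pos=JJ`, and so on), the function names, and the hyperparameters are all hypothetical stand-ins chosen for illustration.

```python
import math
from collections import defaultdict


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k interleaved folds over range(n)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def predict_proba(weights, features, labels):
    """Maximum entropy form: p(y|x) proportional to exp(sum of active (feature, y) weights)."""
    scores = {y: sum(weights[(f, y)] for f in features) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train_maxent(data, labels, epochs=200, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for features, gold in data:
            probs = predict_proba(weights, features, labels)
            for f in features:
                for y in labels:
                    # observed feature count minus expected count under the model
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(data)
    return weights


def classify(weights, features, labels):
    """Return the label with the highest model probability."""
    probs = predict_proba(weights, features, labels)
    return max(labels, key=lambda y: probs[y])


def cross_validate(data, labels, k=3):
    """Return the overall accuracy across k folds."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(data), k):
        weights = train_maxent([data[i] for i in train_idx], labels)
        correct += sum(
            classify(weights, data[i][0], labels) == data[i][1] for i in test_idx
        )
    return correct / len(data)


# Hypothetical toy data: each word is described by binary features standing in
# for similarity to a benchmark emotion word, co-occurrence with a degree
# adverb, and a part-of-speech tag.
LABELS = ["pos", "neg"]
DATA = [
    (["near_good", "pos=JJ"], "pos"),
    (["near_good", "degree_adv"], "pos"),
    (["near_bad", "pos=JJ"], "neg"),
    (["near_bad", "degree_adv"], "neg"),
    (["near_good", "pos=RB"], "pos"),
    (["near_bad", "pos=RB"], "neg"),
]

if __name__ == "__main__":
    print("3-fold accuracy:", cross_validate(DATA, LABELS, k=3))
```

In a fuller pipeline, the feature lists would presumably be derived from the PLSA-extracted seed emotion words rather than written by hand, and a regularized or iterative-scaling trainer may be preferable to the plain gradient ascent used here.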
Acknowledgements
This work is supported by the National Natural Science Foundation under Grant Nos. 61762037, 61640217, 41402290, and 61462028; the Science and Technology Support Program of Jiangxi Province under Grant No. 20151BBE50055; the Science and Technology Project supported by the Education Department of Jiangxi Province under Grant No. GJJ150541; and the Nanchang City Sensor Network and Compressed Sensing Knowledge Innovation Team under Grant No. 2016T75.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Xie, X., Ge, S., Hu, F. et al. An improved algorithm for sentiment analysis based on maximum entropy. Soft Comput 23, 599–611 (2019). https://doi.org/10.1007/s00500-017-2904-0
DOI: https://doi.org/10.1007/s00500-017-2904-0