A Valence-Totaling Model for Vietnamese sentiment classification

  • Original Paper
  • Evolving Systems

Abstract

Sentiment classification has been studied extensively and applied in many fields, and every existing model or method has its own advantages and disadvantages; opinion classification therefore remains an important field of research. In this study, we propose a Valence-Totaling Model for Vietnamese (VTMfV), a new model for Vietnamese sentiment classification, to classify Vietnamese documents. First, we built a new Vietnamese sentiment dictionary containing sentiment-bearing Vietnamese words: negative, positive and neutral words. The dictionary was created using the Jaccard Measure (JM), a similarity measure between two words (or two vectors), and we call it “VSD_JM”. JM has been used in many studies of English sentiment classification; however, it has not yet been used in any study of Vietnamese sentiment classification. This work shows that JM can be applied to research on Vietnamese sentiment analysis. Then, VTMfV uses VSD_JM to classify Vietnamese documents, and it processes all kinds of Vietnamese sentences. Finally, we used VTMfV to classify 30,000 Vietnamese documents, comprising 15,000 positive and 15,000 negative documents, and achieved an accuracy of 63.9% on this Vietnamese testing data set. VTMfV does not depend on any specific domain, does not depend on a training data set, and has no training stage. Based on these results, VTMfV can be applied in different fields of Vietnamese natural language processing. In addition, VTMfV can be applied to many other languages such as Spanish, Korean, etc. It can also be applied to sentiment classification of big data sets in Vietnamese and can classify millions of Vietnamese documents.
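
The two ideas named in the abstract, a dictionary built with the Jaccard Measure and a classifier that totals word valences without any training stage, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: each word is represented by the set of document ids in which it occurs, a word's valence is taken as its Jaccard similarity to positive seed words minus its similarity to negative seed words, and a document's label is the sign of the summed valences. The names `build_dictionary` and `classify`, the seed words, and the toy occurrence data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard Measure |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_dictionary(
    occurrences: Dict[str, Set[int]],
    pos_seeds: Iterable[str],
    neg_seeds: Iterable[str],
) -> Dict[str, float]:
    """Assumed scoring: similarity to positive seeds minus similarity to negative seeds."""
    pos_seeds, neg_seeds = list(pos_seeds), list(neg_seeds)
    dictionary = {}
    for word, docs in occurrences.items():
        pos = sum(jaccard(docs, occurrences[s]) for s in pos_seeds)
        neg = sum(jaccard(docs, occurrences[s]) for s in neg_seeds)
        dictionary[word] = pos - neg
    return dictionary


def classify(tokens: Iterable[str], dictionary: Dict[str, float]) -> str:
    """Total the valences of the tokens; words missing from the dictionary count as 0."""
    total = sum(dictionary.get(t, 0.0) for t in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Toy data invented for this sketch: word -> ids of documents containing it.
occurrences = {
    "tốt": {1, 2, 3},  # "good", used as the positive seed
    "xấu": {4, 5, 6},  # "bad", used as the negative seed
    "đẹp": {1, 2, 7},  # "beautiful"
    "tệ": {4, 5, 8},   # "terrible"
}
vsd = build_dictionary(occurrences, pos_seeds=["tốt"], neg_seeds=["xấu"])
label = classify(["đẹp", "tốt"], vsd)
```

Because the dictionary is fixed before classification, labeling a document needs only a lookup and a sum per token, which is consistent with the abstract's statement that the model has no training stage.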

Author information

Corresponding author

Correspondence to Vo Ngoc Phu.

Appendix

See Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11.

Cite this article

Phu, V.N., Chau, V.T.N., Tran, V.T.N. et al. A Valence-Totaling Model for Vietnamese sentiment classification. Evolving Systems 10, 453–499 (2019). https://doi.org/10.1007/s12530-017-9187-7

  • DOI: https://doi.org/10.1007/s12530-017-9187-7
