Shifting semantic values of English phrases for classification

International Journal of Speech Technology

Abstract

Research on semantic orientation (positive, negative, neutral) has a long history and is important for many commercial applications and scientific studies. In this paper we propose a new model for calculating the emotional values (semantic scores) of English terms (verbs, nouns, adjectives, adverbs, etc.) in three steps. First, we build a basis English emotional dictionary (bEED) using the Sorensen measure (Sorensen coefficient, SM), computed from Google search engine queries with the AND and OR operators. Second, we generate English adjective phrases, adverb phrases and verb phrases according to English grammar by combining adverbs of degree with adjectives, adverbs and verbs. Finally, we identify the valences of the English adverb phrases from their specific contexts. The emotional score of an English phrase is often not fixed; it changes with the context in which the phrase appears. Sentiment classification therefore loses accuracy when the semantic values of emotion-bearing phrases are held constant across all contexts. For these reasons, we propose rules based on English grammar for calculating the sentiment values of emotion-bearing English phrases in their specific contexts. The results of this work can be widely used in applications of, and research on, English semantic classification.
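As a rough illustration of the first step, the Sorensen coefficient of two terms can be computed from result counts. The sketch below is an assumption about how AND and OR counts could enter the calculation, not the paper's actual procedure: the function names and the hit counts are hypothetical, and the second form relies on the set identity |A| + |B| = |A∪B| + |A∩B|, which real search engine counts satisfy only approximately.

```python
def sorensen_from_counts(n_a: int, n_b: int, n_and: int) -> float:
    """Sorensen coefficient 2|A and B| / (|A| + |B|) from three result counts."""
    total = n_a + n_b
    return 0.0 if total == 0 else 2.0 * n_and / total


def sorensen_from_and_or(n_and: int, n_or: int) -> float:
    """Same coefficient, using |A| + |B| = |A or B| + |A and B| (exact for sets)."""
    total = n_and + n_or
    return 0.0 if total == 0 else 2.0 * n_and / total


# Made-up counts for illustration only; these are not real search results.
hits_term, hits_seed, hits_and = 1200, 800, 300
hits_or = hits_term + hits_seed - hits_and  # inclusion-exclusion for true sets

score_a = sorensen_from_counts(hits_term, hits_seed, hits_and)
score_b = sorensen_from_and_or(hits_and, hits_or)
print(score_a, score_b)
```

Both forms agree when the counts behave like exact set sizes. How such a similarity score is turned into a signed valence for a dictionary entry is not specified in the abstract and is left out here.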




Author information

Corresponding author

Correspondence to Vo Ngoc Phu.

About this article

Cite this article

Phu, V.N., Chau, V.T.N. & Tran, V.T.N. Shifting semantic values of English phrases for classification. Int J Speech Technol 20, 509–533 (2017). https://doi.org/10.1007/s10772-017-9420-6

  • DOI: https://doi.org/10.1007/s10772-017-9420-6
