Abstract
Many attempts have been made to determine sentiment polarity in tweets by combining emotion lexicons and various NLP techniques with machine learning. In this paper we rely on emotion lexicons and machine learning only, avoiding additional NLP techniques. We present a scheme that is able to outperform other systems that use both natural language processing and distributional semantics. Our proposal consists of using a cascading classifier on lexicon features to improve accuracy. We evaluate our results on the TASS 2015 corpus, reaching an accuracy only 0.07 below that of the top-ranked system for task 1 (3 levels, whole test corpus). The cascading method we implemented consisted of using the results of a first-stage classification with Multinomial Naïve Bayes as additional columns for a second-stage classification using a Naïve Bayes Tree classifier with feature selection. We tested at least 30 different classifiers, and this combination yielded the best results.
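The cascading idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the two-column lexicon counts, the labels P/N/NEU, and the helper `augment` are hypothetical, a nearest-centroid classifier stands in for the Naïve Bayes Tree second stage, and feature selection is omitted. The only part taken from the abstract is the structure, in which the first-stage Multinomial Naïve Bayes outputs are appended as additional columns before the second stage is trained.

```python
import math


class MultinomialNB:
    """Minimal multinomial Naive Bayes over non-negative count features."""

    def fit(self, X, y, alpha=1.0):
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Per-feature totals with Laplace smoothing (alpha).
            totals = [sum(col) + alpha for col in zip(*rows)]
            norm = sum(totals)
            self.log_likelihood[c] = [math.log(t / norm) for t in totals]
        return self

    def predict_proba(self, x):
        scores = [
            self.log_prior[c]
            + sum(v * w for v, w in zip(x, self.log_likelihood[c]))
            for c in self.classes
        ]
        # Softmax over log-scores, shifted for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


class NearestCentroid:
    """Stand-in second stage (NOT the Naive Bayes Tree from the abstract)."""

    def fit(self, X, y):
        self.centroids = {}
        for c in sorted(set(y)):
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def augment(X, stage1):
    """Append the first-stage class probabilities as extra columns."""
    return [list(x) + stage1.predict_proba(x) for x in X]


# Hypothetical lexicon counts per tweet: [positive_words, negative_words].
X_train = [[3, 0], [2, 1], [0, 3], [1, 2], [1, 1], [0, 0]]
y_train = ["P", "P", "N", "N", "NEU", "NEU"]

# Stage 1 sees the lexicon features; stage 2 sees them plus stage-1 outputs.
stage1 = MultinomialNB().fit(X_train, y_train)
stage2 = NearestCentroid().fit(augment(X_train, stage1), y_train)

X_test = [[4, 0], [0, 4]]
predictions = [stage2.predict(row) for row in augment(X_test, stage1)]
```

In this sketch the first stage is trained and applied on the same training rows, which is the simplest reading of the abstract; whether the original experiments used held-out predictions for the extra columns is not stated here.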
Notes
1. Our vectors can be downloaded at http://likufanele.com/twitterSEL as ARFF files.
Acknowledgments
We gratefully acknowledge the support of Instituto Politécnico Nacional (IPN), ESCOM-IPN, SIP-IPN (project numbers 20160815 and 20162058), COFAA-IPN, and EDI-IPN.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Calvo, H., Juárez Gambino, O. (2018). Cascading Classifiers for Twitter Sentiment Analysis with Emotion Lexicons. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_21
DOI: https://doi.org/10.1007/978-3-319-75487-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1
eBook Packages: Computer Science (R0)