Abstract
In this paper, we compare lexicon-based and machine learning-based approaches to classifying the subjectivity of tweets in Portuguese. For this task, we tested the SentiLex and WordnetAffectBR lexicons and the Sequential Minimal Optimization (SMO) and Naive Bayes algorithms. Our study used the Computer-BR corpus, which contains messages about the technology domain. We obtained better results using the Comprehensive Measurement Feature Selection (CMFS) method with SMO as the classifier. We achieved considerable accuracy when we included the polarities of words in the vector space model of the tweets.
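To make the two families of approaches concrete, the following is a minimal Python sketch, not the implementation used in the paper. The toy lexicon, the tokenizer, the rule "subjective if any word has non-zero polarity", and the choice to append positive and negative word counts to the term vector are all assumptions made for illustration; the abstract does not specify how polarities were encoded, and the real SentiLex and WordnetAffectBR formats are not reproduced here.

```python
from collections import Counter

# Hypothetical toy polarity lexicon (word -> polarity in {-1, 1}).
# Real SentiLex-PT entries and their file format are not reproduced here.
TOY_LEXICON = {"bom": 1, "ótimo": 1, "ruim": -1, "péssimo": -1}


def tokenize(tweet):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in tweet.lower().split())
    return [tok for tok in tokens if tok]


def lexicon_is_subjective(tweet, lexicon=TOY_LEXICON):
    """Lexicon-based rule (assumed): subjective if any token has polarity."""
    return any(lexicon.get(tok, 0) != 0 for tok in tokenize(tweet))


def features_with_polarity(tweet, vocabulary, lexicon=TOY_LEXICON):
    """Term-count vector over `vocabulary`, extended with two dimensions:
    the number of positive and of negative lexicon words in the tweet."""
    tokens = tokenize(tweet)
    counts = Counter(tokens)
    vector = [counts[term] for term in vocabulary]
    positives = sum(1 for tok in tokens if lexicon.get(tok, 0) > 0)
    negatives = sum(1 for tok in tokens if lexicon.get(tok, 0) < 0)
    return vector + [positives, negatives]


# Illustrative inputs (invented for this sketch, not taken from Computer-BR).
vocabulary = ["notebook", "bom", "ruim", "comprei"]
subjective_tweet = "Esse notebook é muito bom!"
objective_tweet = "Comprei um notebook ontem."
```

In a machine learning setting, vectors such as those returned by `features_with_polarity` would be passed to a classifier (for example, Naive Bayes or an SVM trained with SMO) after feature selection; that step is omitted here.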
Notes
- 2. This heuristic is a small variation of the strategy proposed in [9].
References
Feldman, R.: Techniques and applications for sentiment analysis. Commun. ACM 56(4), 82–89 (2013)
Dale, R., Moisl, H., Somers, H. (eds.): Handbook of Natural Language Processing. CRC Press, Boca Raton (2000)
Kamal, A.: Subjectivity Classification using Machine Learning Techniques for Mining Feature-Opinion Pairs from Web Opinion Sources (2013). arXiv preprint arXiv:1312.6962
Fersini, E., Messina, E., Pozzi, F.A.: Subjectivity, polarity and irony detection: a multi-layer approach. In: Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 & the Fourth International Workshop EVALITA (2014)
Drury, B., de Andrade Lopes, A.: A comparison of the effect of feature selection and balancing strategies upon the sentiment classification of Portuguese news stories. In: Proceedings of ENIAC (2014)
Santos, A.P., Ramos, C., Marques, N.C.: Sentiment classification of Portuguese news headlines. Int. J. Softw. Eng. Appl. 9(9), 9–18 (2015)
Rosa, R.L., Rodríguez, D.Z., Bressan, G.: SentiMeter-Br: a social web analysis tool to discover consumers’ sentiment. In: 2013 IEEE 14th International Conference on Mobile Data Management (MDM), vol. 2, pp. 122–124. IEEE (2013)
Morgado, I.C.: Classification of sentiment polarity of Portuguese on-line news. In: Proceedings of the 7th Doctoral Symposium in Informatics Engineering, pp. 139–150 (2012)
Filho, P.P.B., Pardo, T.A., Aluísio, S.M.: An evaluation of the Brazilian Portuguese LIWC dictionary for sentiment analysis. In: 9th Brazilian Symposium in Information and Human Language Technology, Fortaleza, Ceará (2013)
Medhat, W., Hassan, A., Korashy, H.: Sentiment analysis algorithms and applications: a survey. Ain Shams Eng. J. 5(4), 1093–1113 (2014)
Carvalho, P., Silva, M.J.: SentiLex-PT: principais características e potencialidades. Oslo Stud. Lang. 7(1), 425–438 (2015)
Pasqualotti, P.R., Vieira, R.: WordnetAffectBR: uma base lexical de palavras de emoções para a língua Portuguesa. RENOTE 6, 1–10 (2008)
Généreux, M., Martinez, W.: Contrasting objective and subjective Portuguese texts from heterogeneous sources. In: Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, pp. 46–51. Association for Computational Linguistics (2012)
Moraes, S., Silveira, M., Manssour, I.: 7x1-PT: um Corpus extraído do Twitter para Análise de Sentimentos em Língua Portuguesa. BRACIS, STIL (2015)
Yang, J., Liu, Y., Zhu, X., Liu, Z., Zhang, X.: A new feature selection based on comprehensive measurement both in inter-category and intra-category for text categorization. Inf. Process. Manage. 48(4), 741–754 (2012)
Yang, J., Qu, Z., Liu, Z.: Improved feature-selection method considering the imbalance problem in text categorization. Sci. World J. (2014)
Souza, M., Vieira, R.: Sentiment analysis on twitter data for Portuguese language. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds.) PROPOR 2012. LNCS, vol. 7243, pp. 241–247. Springer, Heidelberg (2012)
Lambov, D., Dias, G., Noncheva, V.: High-level features for learning subjective language across domains. In: Proceedings of International AAAI Conference on Weblogs and Social Media ICWSM (2009)
Acknowledgments
Our thanks to Dell for the financial support of this work.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Moraes, S.M.W., Santos, A.L.L., Redecker, M., Machado, R.M., Meneguzzi, F.R. (2016). Comparing Approaches to Subjectivity Classification: A Study on Portuguese Tweets. In: Silva, J., Ribeiro, R., Quaresma, P., Adami, A., Branco, A. (eds) Computational Processing of the Portuguese Language. PROPOR 2016. Lecture Notes in Computer Science, vol 9727. Springer, Cham. https://doi.org/10.1007/978-3-319-41552-9_8
DOI: https://doi.org/10.1007/978-3-319-41552-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41551-2
Online ISBN: 978-3-319-41552-9
eBook Packages: Computer Science, Computer Science (R0)