Abstract
Sentiment Analysis has a fundamental role in analyzing users opinions in all kinds of textual sources. Computing accurately sentiment expressed in huge amount of textual data is a key task largely required by the market, and nowadays industrial engines make available ready-to-use APIs for sentiment analysis-related tasks. However, building sentiment engines showing high accuracy on structurally different textual sources (e.g. reviews, tweets, blogs, etc.) is not a trivial task. Papers about cross-source evaluation lack of a comparison with industrial engines, which are instead specifically designed for dealing with multiple sources.
In this paper, we compare the results of research and industrial engines on an extensive experimental evaluation, considering the document-level polarity detection task performed on different textual sources: tweets, apps reviews and general products reviews, in both English and Italian. The experimental evaluation results help the reader to quantify the performance gap between industrial and research sentiment engines when both are tested on heterogeneous textual sources and on different languages (English/Italian). Finally, we present the results of our multi-source solution X2Check. Considering an overall cross-source average F-score on all of the results, X2Check shows a performance that is 9.1% and 5.1% higher than Google CNL, respectively on Italian and English benchmarks. Compared to the research engines, X2Check shows a F-score that is always higher than tools not specifically trained on the test set under evaluation; it is lower at most of 3.4% in Italian and 11.6% on English benchmarks, compared to the best research tools specifically trained on the target source.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
References
Araújo, M., Gonçalves, P., Cha, M., Benevenuto, F.: iFeel: a system that compares and combines sentiment analysis methods. In: Proceedings of WWW 2014 Companion, pp. 75–78 (2014)
Araújo, M., dos Reis, J.C., Pereira, A.M., Benevenuto, F.: An evaluation of machine translation for multilingual sentence-level sentiment analysis. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pisa, Italy, 4–8 April 2016, pp. 1140–1145 (2016)
Barbieri, F., Basile, V., Croce, D., Nissim, M., Novielli, N., Patti, V.: Overview of the evalita 2016 sentiment polarity classification task. In: Proceedings of CLiC-it 2016 & EVALITA 2016 (2016)
Blitzer, J., Dredze, M., Pereira, F.: Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: Proceedings of ACL 2007 (2007)
Bollegala, D., Mu, T., Goulermas, J.Y.: Cross-domain sentiment classification using sentiment sensitive embeddings. IEEE Trans. Knowl. Data Eng. 28(2), 398–410 (2016)
Di Rosa, E., Durante, A.: App2check: a machine learning-based system for sentiment analysis of app reviews in Italian language. In: Proceedings of the International Workshop on Social Media World Sensors (Sideways)- Held in conjunction with LREC 2016, pp. 8–11 (2016)
Dragoni, M., Recupero, D.R.: Challenge on fine-grained sentiment analysis within ESWC2016. In: Sack, H., Dietze, S., Tordai, A., Lange, C. (eds.) Semantic Web Challenges - Third SemWebEval Challenge at ESWC 2016, vol. 641, pp. 79–94. Springer, Heidelberg (2016)
Heredia, B., Khoshgoftaar, T.M., Prusa, J.D., Crawford, M.: Cross-domain sentiment analysis: an empirical investigation. In: Proceedings of IRI 2016, pp. 160–165 (2016)
Heredia, B., Khoshgoftaar, T.M., Prusa, J.D., Crawford, M.: Integrating multiple data sources to enhance sentiment prediction. In: Proceedings of IEEE CIC 2016, pp. 285–291 (2016)
Li, F., Wang, S., Liu, S., Zhang, M.: SUIT: a supervised user-item based topic model for sentiment analysis. In: Proceedings of AAAI 2014, pp. 1636–1642 (2014)
Liu, B.: Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers, San Rafael (2012)
Mejova, Y., Srinivasan, P.: Crossing media streams with sentiment: domain adaptation in blogs, reviews and Twitter. In: Proceedings of ICWSM 2012 (2012)
Nakov, P., Ritter, A., Sara, R., Sebastiani, F., Stoyanov, V.: Semeval-2016 task 4: sentiment analysis in Twitter. In: Proceedings of SemEval 2016. Association for Computational Linguistics (2016)
Pan, S.J., Ni, X., Sun, J., Yang, Q., Chen, Z.: Cross-domain sentiment classification via spectral feature alignment. In: Proceedings of WWW 2010, pp. 751–760 (2010)
Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2(1–2), 1–135 (2008)
Rosenthal, S., Farra, N., Nakov, P.: SemEval-2017 task 4: sentiment analysis in Twitter. In: Proceedings of SemEval 2017. Association for Computational Linguistics (2017)
Täckström, O., McDonald, R.: Discovering fine-grained sentiment with latent variable structured prediction models. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Mudoch, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 368–374. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20161-5_37
Täckström, O., McDonald, R.T.: Semi-supervised latent variable models for sentence-level sentiment analysis. In: Proceedings of HLT 2011, pp. 569–574 (2011)
Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., Kappas, A.: Sentiment strength detection in short informal text. JASIST 61(12), 2544–2558 (2010)
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis. Comput. Linguist. 35(3), 399–433 (2009)
Wu, F., Huang, Y.: Sentiment domain adaptation with multiple sources. In: Proceedings of ACL 2016 (2016)
Wu, F., Huang, Y., Yuan, Z.: Domain-specific sentiment classification via fusing sentiment knowledge from multiple sources. Inf. Fusion 35, 26–37 (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Di Rosa, E., Durante, A. (2017). Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science(), vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_11
Download citation
DOI: https://doi.org/10.1007/978-3-319-70169-1_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70168-4
Online ISBN: 978-3-319-70169-1
eBook Packages: Computer ScienceComputer Science (R0)