Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources

Di Rosa, Emanuele; Durante, Alberto

doi:10.1007/978-3-319-70169-1_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10640)

Included in the following conference series:

Conference of the Italian Association for Artificial Intelligence

Abstract

Sentiment Analysis has a fundamental role in analyzing users' opinions in all kinds of textual sources. Accurately computing the sentiment expressed in huge amounts of textual data is a key task in high demand on the market, and nowadays industrial engines make available ready-to-use APIs for sentiment analysis-related tasks. However, building sentiment engines that show high accuracy on structurally different textual sources (e.g. reviews, tweets, blogs, etc.) is not a trivial task. Papers about cross-source evaluation lack a comparison with industrial engines, which are instead specifically designed for dealing with multiple sources.

In this paper, we compare the results of research and industrial engines in an extensive experimental evaluation, considering the document-level polarity detection task performed on different textual sources: tweets, app reviews and general product reviews, in both English and Italian. The results of the experimental evaluation help the reader to quantify the performance gap between industrial and research sentiment engines when both are tested on heterogeneous textual sources and on different languages (English/Italian). Finally, we present the results of our multi-source solution X2Check. Considering an overall cross-source average F-score on all of the results, X2Check shows a performance that is 9.1% higher than Google CNL on the Italian benchmarks and 5.1% higher on the English benchmarks. Compared to the research engines, X2Check always shows a higher F-score than tools not specifically trained on the test set under evaluation; compared to the best research tools specifically trained on the target source, it is lower by at most 3.4% on Italian and 11.6% on English benchmarks.
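As an illustration of how a cross-source average F-score can be computed, the following is a minimal sketch. It assumes a macro-averaged F1 over polarity classes and an unweighted mean across sources; the abstract does not state the exact averaging scheme used in the paper, and the function names, labels and toy data below are hypothetical.

```python
def macro_f1(gold, pred, labels):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores.

    Assumption: the abstract does not specify which F-score variant
    the paper uses; macro-F1 is chosen here only for illustration.
    """
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


def cross_source_average(per_source_scores):
    """Unweighted mean of the F-scores obtained on each textual source."""
    return sum(per_source_scores.values()) / len(per_source_scores)


if __name__ == "__main__":
    # Made-up toy predictions, not data from the paper.
    labels = ["positive", "negative"]
    benchmarks = {
        "tweets": (["positive", "negative", "positive", "negative"],
                   ["positive", "negative", "negative", "negative"]),
        "app_reviews": (["positive", "positive", "negative"],
                        ["positive", "positive", "negative"]),
    }
    per_source = {name: macro_f1(gold, pred, labels)
                  for name, (gold, pred) in benchmarks.items()}
    print(per_source)
    print(cross_source_average(per_source))
```

An unweighted mean gives every source the same influence regardless of benchmark size; weighting by the number of documents would be an equally plausible choice and would generally yield different figures.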

Notes

1. https://app2check.com/performance

Author information

Authors and Affiliations

Head of Artificial Intelligence at Finsa s.p.a., Genoa, Italy
Emanuele Di Rosa
Research Scientist at Finsa s.p.a., Genoa, Italy
Alberto Durante

Corresponding author

Correspondence to Emanuele Di Rosa.

Editor information

Editors and Affiliations

University of Bari, Bari, Italy
Floriana Esposito
University of Rome Tor Vergata, Rome, Italy
Roberto Basili
University of Bari, Bari, Italy
Stefano Ferilli
University of Bari, Bari, Italy
Francesca A. Lisi

About this paper

Cite this paper

Di Rosa, E., Durante, A. (2017). Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_11

DOI: https://doi.org/10.1007/978-3-319-70169-1_11
Published: 07 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70168-4
Online ISBN: 978-3-319-70169-1
eBook Packages: Computer Science, Computer Science (R0)