On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting

  • Conference paper
Product-Focused Software Process Improvement (PROFES 2022)

Abstract

A positive working climate is essential in modern software development. It enhances productivity, since a satisfied developer tends to deliver better results. Sentiment analysis tools are a means to analyze and classify textual communication between developers according to the polarity of the statements. Most of these tools deliver promising results when used with test data from the domain they were developed for (e.g., GitHub). However, the tools’ results are less reliable when they are used in a different domain (e.g., Stack Overflow). One possible way to mitigate this problem is to combine different tools trained in different domains. In this paper, we analyze a combination of three sentiment analysis tools in a voting classifier with respect to reliability and performance. The tools are trained and evaluated using five existing polarity data sets (e.g., from GitHub). The results indicate that this kind of combination of tools is a good choice in the within-platform setting. However, a majority vote does not necessarily lead to better results when applied in a cross-platform setting. In most cases, the best individual tool in the ensemble is preferable. This is mainly due to the often large difference in performance of the individual tools, even on the same data set. However, it may also be due to differences in how the data sets were annotated.
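
To illustrate the kind of combination the abstract describes, the following is a minimal sketch of a majority vote over three polarity classifiers. It is not the authors' implementation: the three tool functions are hypothetical keyword-based stand-ins for real sentiment analysis tools, and the fallback to "neutral" when no label wins a strict majority is an assumption made here for illustration.

```python
from collections import Counter
from typing import Callable, Sequence

Label = str  # one of "positive", "negative", "neutral"
Classifier = Callable[[str], Label]


def majority_vote(labels: Sequence[Label], fallback: Label = "neutral") -> Label:
    """Return the label chosen by more than half of the voters.

    If no label reaches a strict majority (e.g. a three-way tie),
    return the fallback label. The fallback is an assumption of this sketch.
    """
    if not labels:
        raise ValueError("at least one vote is required")
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else fallback


def classify_with_ensemble(text: str, tools: Sequence[Classifier]) -> Label:
    """Ask every tool for a polarity label and combine the votes."""
    return majority_vote([tool(text) for tool in tools])


# Hypothetical stand-ins for three tools trained on different platforms.
def tool_a(text: str) -> Label:
    return "positive" if "thanks" in text.lower() else "neutral"


def tool_b(text: str) -> Label:
    return "negative" if "broken" in text.lower() else "neutral"


def tool_c(text: str) -> Label:
    lowered = text.lower()
    if "great" in lowered:
        return "positive"
    if "broken" in lowered:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify_with_ensemble("Thanks, great fix!", [tool_a, tool_b, tool_c]))
```

With an odd number of voters and three polarity classes, a three-way disagreement is still possible, which is why the sketch needs an explicit fallback. It also makes the abstract's caveat concrete: two weak tools can outvote the one tool that is correct.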

Acknowledgment

This research was funded by the Leibniz University Hannover as a Leibniz Young Investigator Grant (Project ComContA, Project Number 85430128, 2020–2022).

Author information

Corresponding author

Correspondence to Martin Obaidi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Obaidi, M., Holm, H., Schneider, K., Klünder, J. (2022). On the Limitations of Combining Sentiment Analysis Tools in a Cross-Platform Setting. In: Taibi, D., Kuhrmann, M., Mikkonen, T., Klünder, J., Abrahamsson, P. (eds) Product-Focused Software Process Improvement. PROFES 2022. Lecture Notes in Computer Science, vol 13709. Springer, Cham. https://doi.org/10.1007/978-3-031-21388-5_8

  • DOI: https://doi.org/10.1007/978-3-031-21388-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21387-8

  • Online ISBN: 978-3-031-21388-5

  • eBook Packages: Computer Science, Computer Science (R0)
