DOI: 10.1145/3551349.3559570
research-article

Identification and Mitigation of Toxic Communications Among Open Source Software Developers

Published: 05 January 2023

ABSTRACT

Toxic and unhealthy conversations among developers may reduce the professional harmony and productivity of Free and Open Source Software (FOSS) projects. For example, toxic code review comments may prompt an author to push back against completing the suggested changes. A toxic exchange with another person may hamper future communication and collaboration. Research also suggests that toxicity disproportionately impacts newcomers, women, and other participants from marginalized groups. Toxicity is therefore a barrier to promoting diversity, equity, and inclusion. Since toxic communications are not uncommon in FOSS communities and may have serious repercussions, the primary objective of my proposed dissertation is to automatically identify and mitigate toxicity in developers’ textual interactions. Toward this goal, I aim to: i) build an automated toxicity detector for the Software Engineering (SE) domain, ii) identify the notion of toxicity across demographics, and iii) analyze the impacts of toxicity on the outcomes of Open Source Software (OSS) projects.


Published in

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
October 2022
2006 pages
ISBN: 9781450394758
DOI: 10.1145/3551349

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article, Research, Refereed limited

Overall Acceptance Rate: 82 of 337 submissions, 24%
