ABSTRACT
Toxic and unhealthy conversations during developers' communications may reduce the professional harmony and productivity of Free and Open Source Software (FOSS) projects. For example, a toxic code review comment may cause the author to push back against completing the suggested changes, and a toxic exchange with another person may hamper future communication and collaboration. Research also suggests that toxicity disproportionately impacts newcomers, women, and participants from other marginalized groups; toxicity is therefore a barrier to promoting diversity, equity, and inclusion. Since toxic communications are not uncommon in FOSS communities and may have serious repercussions, the primary objective of my proposed dissertation is to automatically identify and mitigate toxicity during developers' textual interactions. Toward this goal, I aim to: i) build an automated toxicity detector for the Software Engineering (SE) domain, ii) identify how the notion of toxicity varies across demographics, and iii) analyze the impacts of toxicity on the outcomes of Open Source Software (OSS) projects.