DOI: 10.1145/3383219.3383240
research-article

Mining Decision-Making Processes in Open Source Software Development: A Study of Python Enhancement Proposals (PEPs) using Email Repositories

Published: 17 April 2020

ABSTRACT

Open source software (OSS) communities are often able to produce high quality software comparable to proprietary software. The success of an open source software development (OSSD) community is often attributed to the underlying governance model, and a key component of these models is the decision-making (DM) process. While there have been studies on the decision-making processes publicized by OSS communities (e.g., through published process diagrams), little has been done to study decision-making processes that can be extracted using a bottom-up, data-driven approach, which can then be used to assess whether the publicized processes conform to the extracted processes. To bridge this gap, we undertook a large-scale data-driven study to understand how decisions are made in an OSSD community, using the case study of Python Enhancement Proposals (PEPs), which embody decisions made during the evolution of the Python language. Our main contributions are:

(a) the design and development of a framework using information retrieval and natural language processing techniques to analyze the Python email archives (comprising 1.48 million emails), and

(b) the extraction of decision-making processes that reveal activities that are neither explicitly mentioned in documentation published by the Python community nor identified in prior research work. Our results provide insights into the actual decision-making process employed by the Python community.
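As a rough illustration of the bottom-up idea described above, the sketch below scans email bodies for PEP mentions and simple activity keywords, then orders the hits by date into one event trace per PEP. It is a minimal sketch under stated assumptions, not the framework from the paper: the regular expression, the keyword list, the function names, and the sample messages are all hypothetical, and plain keyword matching stands in for the information retrieval and natural language processing techniques the abstract refers to.

```python
import re
from collections import defaultdict
from datetime import date

# Hypothetical pattern for mentions such as "PEP 428", "PEP-0428" or "pep428".
PEP_PATTERN = re.compile(r"\bPEP[\s-]*0*(\d{1,4})\b", re.IGNORECASE)

# Hypothetical keyword-to-activity map; the vocabulary used in the paper
# is not given in this abstract.
ACTIVITY_KEYWORDS = {
    "draft": "drafted",
    "accept": "accepted",
    "reject": "rejected",
    "withdraw": "withdrawn",
}


def pep_numbers(text):
    """Return the sorted, de-duplicated PEP numbers mentioned in text."""
    return sorted({int(number) for number in PEP_PATTERN.findall(text)})


def activities(text):
    """Return the activity labels whose keyword occurs in text."""
    lowered = text.lower()
    return [label for keyword, label in ACTIVITY_KEYWORDS.items()
            if keyword in lowered]


def build_traces(emails):
    """Group (date, activity) events by PEP number, ordered by date.

    emails is an iterable of (sent_date, body) pairs.
    """
    traces = defaultdict(list)
    for sent, body in emails:
        for pep in pep_numbers(body):
            for activity in activities(body):
                traces[pep].append((sent, activity))
    return {pep: sorted(events) for pep, events in traces.items()}


# Invented sample messages, for illustration only.
sample_emails = [
    (date(2020, 2, 1), "PEP-0428 is accepted."),
    (date(2020, 1, 1), "I have drafted PEP 428 for review."),
    (date(2020, 1, 15), "Nothing relevant here, just pepper."),
]
traces = build_traces(sample_emails)
```

Ordered traces of this kind are the sort of input a process-discovery tool expects; a real pipeline would also need thread reconstruction, quoted-text removal, and far more robust activity detection than substring matching.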

    • Published in

      EASE '20: Proceedings of the 24th International Conference on Evaluation and Assessment in Software Engineering
      April 2020
      544 pages
      ISBN:9781450377317
      DOI:10.1145/3383219
      • General Chairs: Jingyue Li, Letizia Jaccheri
      • Program Chairs: Torgeir Dingsøyr, Ruzanna Chitchyan

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 71 of 232 submissions, 31%