ABSTRACT
Open source software (OSS) communities are often able to produce high-quality software comparable to proprietary software. The success of an open source software development (OSSD) community is often attributed to its underlying governance model, and a key component of such models is the decision-making (DM) process. While there have been studies of the decision-making processes publicized by OSS communities (e.g., through published process diagrams), little has been done to extract decision-making processes using a bottom-up, data-driven approach and then use them to assess whether the publicized processes conform to the extracted ones. To bridge this gap, we undertook a large-scale data-driven study to understand how decisions are made in an OSSD community, using the case study of Python Enhancement Proposals (PEPs), which embody decisions made during the evolution of the Python language. Our main contributions are:
(a) the design and development of a framework using information retrieval and natural language processing techniques to analyze the Python email archives (comprising 1.48 million emails), and
(b) the extraction of decision-making processes that reveal activities that are neither explicitly mentioned in documentation published by the Python community nor identified in prior research work.
Our results provide insights into the actual decision-making process employed by the Python community.
Index Terms
- Mining Decision-Making Processes in Open Source Software Development: A Study of Python Enhancement Proposals (PEPs) using Email Repositories