ABSTRACT
Pull-based development has become a popular choice for developing distributed projects, such as those hosted on GitHub. In this model, contributions are pulled from forked repositories, modified, and then later merged back into the main repository. In this work, we report on two empirical studies that investigate pull request (PR) merges of Active Merchant, a commercial project developed by Shopify Inc. In the first study, we apply data mining techniques to the project's GitHub repository to explore the nature of merges, and we conduct a manual inspection of pull requests; we also investigate which factors contribute to PR merge time and outcome. In the second study, we perform a qualitative analysis of the results of a survey of developers who contributed to Active Merchant. This study addresses PR review quality and developers' perception of it. The results provide insights into how these developers perform pull request merges and which factors they consider to contribute to how they review and merge pull requests.
Studying pull request merges: a case study of Shopify's Active Merchant