ABSTRACT
Continuous Integration (CI) is a software development practice that leads developers to integrate their work more frequently. Software projects have broadly adopted CI to ship new releases more frequently and to improve code integration. The adoption of CI is motivated by the allure of delivering new functionalities more quickly. However, there is little empirical evidence to support such a claim. Through the analysis of 162,653 pull requests (PRs) of 87 GitHub projects that are implemented in 5 different programming languages, we empirically investigate the impact of adopting CI on the time to deliver merged PRs. Surprisingly, only 51.3% of the projects deliver merged PRs more quickly after adopting CI. We also observe that the large increase in PR submissions after CI adoption is a key reason why projects deliver PRs more slowly after adopting CI. To investigate the factors that are related to the time-to-delivery of merged PRs, we train regression models that obtain sound median R-squared values of 0.64 to 0.67. Finally, a deeper analysis of our models indicates that, before the adoption of CI, the integration load of the development team, i.e., the number of submitted PRs competing to be merged, is the metric with the greatest impact on the time to deliver merged PRs. Our models also reveal that PRs that are merged more recently in a release cycle experience a slower delivery time.
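The before/after comparison described in the abstract can be illustrated with a minimal sketch. It assumes that the delivery time of a merged PR is the time from its merge to the first release published at or after that merge; the abstract does not spell out this definition. The function names and the toy dates are hypothetical and are not taken from the study's data or tooling.

```python
from bisect import bisect_left
from datetime import datetime
from statistics import median


def delivery_days(merged_at, release_dates):
    """Days from merge to the first release at or after the merge.

    Assumes release_dates is sorted ascending. Returns None when no
    release follows the merge (the PR is not yet delivered).
    """
    i = bisect_left(release_dates, merged_at)
    if i == len(release_dates):
        return None
    return (release_dates[i] - merged_at).total_seconds() / 86400


def median_delivery(merge_dates, release_dates, ci_adopted_at):
    """Median delivery time (days) of PRs merged before and after CI adoption."""
    before, after = [], []
    for merged_at in merge_dates:
        days = delivery_days(merged_at, release_dates)
        if days is None:
            continue  # skip PRs that were never shipped in a release
        (before if merged_at < ci_adopted_at else after).append(days)
    return (
        median(before) if before else None,
        median(after) if after else None,
    )


# Hypothetical toy project: three releases, CI adopted on 2020-02-01.
releases = [datetime(2020, 1, 31), datetime(2020, 2, 29), datetime(2020, 3, 31)]
merges = [
    datetime(2020, 1, 11),
    datetime(2020, 1, 21),
    datetime(2020, 2, 9),
    datetime(2020, 3, 1),
]
print(median_delivery(merges, releases, datetime(2020, 2, 1)))
```

In this toy project the median delivery time is higher after CI adoption than before it, which is the kind of per-project outcome the study counts when it reports the share of projects that deliver merged PRs more quickly after adopting CI.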
- Studying the impact of adopting continuous integration on the delivery time of pull requests