A study of build inflation in 30 million CPAN builds on 13 Perl versions and 10 operating systems

Empirical Software Engineering

Abstract

Continuous Integration (CI) is a cornerstone of modern quality assurance, providing on-demand builds (compilation and tests) of code changes or software releases. Yet the many existing CI systems do not help developers interpret build results, in particular when facing build inflation. Build inflation arises when each code change has to be built on dozens of combinations (configurations) of runtime environments (REs), operating systems (OSes), and hardware architectures (HAs). A code change C1 sent to the CI system may introduce programming faults that cause all of these builds to fail, while a change C2 introducing a new library dependency might cause only one particular build configuration to fail. Consequently, the one build failure due to C2 will be “hidden” among the dozens of build failures due to C1 when the CI system reports the results of the builds. We have named this phenomenon build inflation, because it may bias developers’ interpretation of build results by “hiding” certain types of faults.
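The C1/C2 scenario can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the study: the 3 × 4 build matrix, the Perl version and OS labels, and the simulated `build` function, which simply fails C1 everywhere and C2 on one configuration.

```python
from collections import Counter
from itertools import product

# Hypothetical build matrix: 3 Perl versions x 4 OSes = 12 configurations.
PERL_VERSIONS = ["5.8", "5.10", "5.20"]
OSES = ["linux", "freebsd", "darwin", "mswin32"]
CONFIGS = list(product(PERL_VERSIONS, OSES))


def build(change, config):
    """Return True if the simulated build of `change` on `config` passes."""
    if change == "C1":
        # Programming fault: the build fails on every configuration.
        return False
    if change == "C2":
        # Missing dependency: the build fails on one configuration only.
        return config != ("5.8", "mswin32")
    return True


def failures_per_change(changes):
    """Count failed builds for each change across all configurations."""
    counts = Counter()
    for change, config in product(changes, CONFIGS):
        if not build(change, config):
            counts[change] += 1
    return counts


counts = failures_per_change(["C1", "C2"])
total = sum(counts.values())
print(f"total failures reported: {total}")
for change, n in sorted(counts.items()):
    print(f"{change}: failed on {n} of {len(CONFIGS)} configurations "
          f"({n / total:.0%} of all reported failures)")
```

In this toy matrix, C1 accounts for 12 of the 13 reported failures, so a flat list of failures is dominated by C1 and the single configuration-specific failure of C2 is easy to overlook. Grouping failures by change and by configuration, as the sketch does, is one way to make it visible again.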

In this paper, we study build inflation through a large-scale study of the relationship between REs and OSes and build failures on 30 million builds of the CPAN repository on the CPAN Testers package-level CI system. We show that the builds of Perl packages may fail differently on different REs, OSes, and combinations thereof. Thus, we show that the results provided by CPAN Testers require filtering and selection to identify real trends of build failures among the many failures. Manual analysis of 791 build failures shows that dependency faults (missing modules) and programming faults (undefined values) are the main reasons for failures, with dependency faults being easier to fix. We conclude with recommendations for practitioners and researchers on interpreting build results, as well as for tool builders, who should improve the scheduling of builds and the reporting of build failures.

Notes

  1. http://www.cpan.org

  2. In this paper, we use the term “package” in its usual sense while Perl developers talk about “distribution”.

  3. Most of the vectors did not contain any build failure, which is expected.

  4. http://www.ptidej.net/downloads/replications/emse19b

  5. The models are not useful to predict build failures in practice because they only include OSes and REs and ignore other factors. However, they are useful to validate the extent to which OSes and REs alone explain build failures, i.e., to validate the strength of the link between build configurations and build failures.

Acknowledgements

Part of this work was funded by the NSERC Discovery Grant and Canada Research Chair programs.

Author information

Corresponding author

Correspondence to Mahdis Zolfagharinia.

Additional information

Communicated by: Romain Robbes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zolfagharinia, M., Adams, B. & Guéhéneuc, YG. A study of build inflation in 30 million CPAN builds on 13 Perl versions and 10 operating systems. Empir Software Eng 24, 3933–3971 (2019). https://doi.org/10.1007/s10664-019-09709-6

  • DOI: https://doi.org/10.1007/s10664-019-09709-6
