A multi-dimensional analysis of technical lag in Debian-based Docker images

Empirical Software Engineering

Abstract

Container-based solutions, such as Docker, have become increasingly relevant in the software industry to facilitate deploying and maintaining software systems. Little is known, however, about how outdated such containers are at the moment of their release or when used in production. This article addresses this question by measuring and comparing five different dimensions of technical lag that Docker container images can face: package lag, time lag, version lag, vulnerability lag, and bug lag. We instantiate the formal technical lag framework from previous work to operationalise these different dimensions of lag on Docker Hub images based on the Debian Linux distribution. We carry out a large-scale empirical study of such technical lag, over a three-year period, in 140,498 Debian images. We compare official and community images, as well as images based on different Debian distributions: OldStable, Stable or Testing. The analysis shows that the different dimensions of technical lag are complementary, providing multiple insights. Official Debian images consistently have a lower lag than community images for all considered lag dimensions. The amount of lag incurred depends on the type of Debian distribution and the considered lag dimension. Our research offers empirical evidence that developers and deployers of Docker images can benefit from identifying the extent to which their containers are outdated according to the considered dimensions, and from mitigating the risks related to such outdatedness.

Notes

  1. https://github.com/shogun-toolbox/shogun/blob/develop/configs/shogun-sdk/Dockerfile

  2. https://docs.npmjs.com/cli/outdated.html

  3. https://www.mojohaus.org/versions-maven-plugin/

  4. https://david-dm.org/

  5. https://guides.rubygems.org/command-reference/#gem-outdated

  6. https://github.com/containrrr/watchtower

  7. https://github.com/pyouroboros/ouroboros

  8. https://github.com/neglectos/ConPan

  9. An example of rule violation is forgetting the -y flag when using apt-get install.

  10. Certified images are built with best practices, tested and validated against the Docker Enterprise Edition and pass security requirements.

  11. Verified images are high-quality images from verified publishers. These products are published and maintained directly by a commercial entity.

  12. https://registry.hub.docker.com/v2/repositories/library/debian/

  13. https://hub.docker.com/_/debian

  14. Downloading all available images would have taken at least 6 extra months, and would have required considerably more storage capacity.

  15. https://snapshot.debian.org/archive/debian/ and https://snapshot.debian.org/archive/debian-security/

  16. https://security-tracker.debian.org/tracker/data/json

  17. https://cve.mitre.org/cve/

  18. https://nvd.nist.gov

  19. https://udd.debian.org/bugs/

  20. https://www.debian.org/Bugs/Developer

  21. https://www.debian.org/doc/debian-policy/ch-controlfields.html#version

  22. If n different tests are carried out over the same dataset, then for each individual test one can only reject H0 if \(p < \frac{0.01}{n}\). In our case n = 28, i.e., p < 0.01/28 ≈ 0.00036.

  23. Extra analysis and results, distinguishing the evolution trends both for official and community images, can be found in our reproduction package.

  24. https://www.debian.org/News/2016/20160917

  25. https://github.com/docker-library/official-images/commit/a0884f0cd8758a0a30cf187f25ef217e3915979f

  26. https://www.debian.org/News/2017/20170114

  27. https://github.com/docker-library/official-images/commit/fbbcd34e82dcea6e75f5a5ea465d49912d996261

  28. https://www.debian.org/News/2017/index.en.html

  29. https://www.debian.org/News/2018/20180310

  30. An example of this was provided with the Dockerfile for the community image shogun-dev:latest presented in Section 2.2.
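The adjusted significance threshold in note 22 (a correction commonly known as the Bonferroni correction) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the function name `bonferroni_threshold` is hypothetical and does not come from the article or its replication package.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold alpha / n_tests.

    Hypothetical helper for illustration: with n_tests hypothesis tests
    on the same dataset, H0 is rejected for one test only if its
    p-value is below this value.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Values taken from note 22: overall alpha = 0.01 and n = 28 tests.
threshold = bonferroni_threshold(0.01, 28)
print(f"{threshold:.5f}")  # prints 0.00036
```

With the values from note 22, a single test is therefore considered significant only if its p-value falls below roughly 0.00036.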

Acknowledgements

This research is carried out in the context of the Excellence of Science project 30446992 SECO-Assist financed by FWO-Vlaanderen and F.R.S.-FNRS. We acknowledge the support of the Government of Spain through project “BugBirth” (RTI2018-101963-B-100).

Author information

Corresponding author

Correspondence to Ahmed Zerouali.

Additional information

Communicated by: Emad Shihab and David Lo

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Software Analysis, Evolution and Reengineering (SANER)

About this article

Cite this article

Zerouali, A., Mens, T., Decan, A. et al. A multi-dimensional analysis of technical lag in Debian-based Docker images. Empir Software Eng 26, 19 (2021). https://doi.org/10.1007/s10664-020-09908-6

  • DOI: https://doi.org/10.1007/s10664-020-09908-6
