Explaining Successful Docker Images Using Pattern Mining Analysis

  • Conference paper
Software Technologies: Applications and Foundations (STAF 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11176)

Abstract

Docker is on the rise in today’s enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image directly affects its usage, and hence the potential revenue of its developers. In this paper, we present a frequent pattern mining-based approach for understanding how to improve an image to increase its popularity. The results of this work can provide valuable insights to Docker image providers, helping them to design more competitive software products.

Notes

  1. An example of raw Docker image data is available at https://goo.gl/hibue1.

  2. As images are downloaded as compressed archives, their download size corresponds to their compressed size (in GB).

  3. The actual size of an image corresponds to its decompressed size (in GB).

  4. Publicly available at https://goo.gl/ggvKN3.

  5. http://www.borgelt.net/pyfim.html.

  6. The Python code and the list of all the extracted itemsets and popularity rules can be found at https://github.com/di-unipi-socc/DockerImageMiner.

  7. We discarded very similar rules in order to have a broader overview.

Acknowledgments

Work partly supported by the EU H2020 Program under the funding scheme “INFRAIA-1-2014-2015: Research Infrastructures” grant agreement 654024 “SoBigData” http://www.sobigdata.eu.

Author information

Corresponding author

Correspondence to Riccardo Guidotti.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Guidotti, R., Soldani, J., Neri, D., Brogi, A. (2018). Explaining Successful Docker Images Using Pattern Mining Analysis. In: Mazzara, M., Ober, I., Salaün, G. (eds) Software Technologies: Applications and Foundations. STAF 2018. Lecture Notes in Computer Science, vol 11176. Springer, Cham. https://doi.org/10.1007/978-3-030-04771-9_9

  • DOI: https://doi.org/10.1007/978-3-030-04771-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04770-2

  • Online ISBN: 978-3-030-04771-9

  • eBook Packages: Computer Science, Computer Science (R0)
