Abstract
Docker is on the rise in today’s enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image directly affects its usage, and hence the potential revenues of its developers. In this paper, we present a frequent pattern mining-based approach for understanding how to improve an image to increase its popularity. The results of this work can provide valuable insights to Docker image providers, helping them design more competitive software products.
Notes
1. An example of raw Docker image data is available at https://goo.gl/hibue1.
2. As images are downloaded as compressed archives, their download size corresponds to their compressed size (in GB).
3. The actual size of an image corresponds to its decompressed size (in GB).
4. Publicly available at https://goo.gl/ggvKN3.
5.
6. The Python code and the list of all the itemsets and popularity rules extracted can be found at https://github.com/di-unipi-socc/DockerImageMiner.
7. We discarded very similar rules in order to have a broader overview.
Acknowledgments
Work partly supported by the EU H2020 Program under the funding scheme “INFRAIA-1-2014-2015: Research Infrastructures”, grant agreement 654024 “SoBigData” (http://www.sobigdata.eu).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Guidotti, R., Soldani, J., Neri, D., Brogi, A. (2018). Explaining Successful Docker Images Using Pattern Mining Analysis. In: Mazzara, M., Ober, I., Salaün, G. (eds) Software Technologies: Applications and Foundations. STAF 2018. Lecture Notes in Computer Science, vol 11176. Springer, Cham. https://doi.org/10.1007/978-3-030-04771-9_9
DOI: https://doi.org/10.1007/978-3-030-04771-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04770-2
Online ISBN: 978-3-030-04771-9
eBook Packages: Computer Science, Computer Science (R0)