Modeling the impact of Python and R packages using dependency and contributor networks

  • Original Article
  • Social Network Analysis and Mining

Abstract

This paper develops methods to estimate the factors that affect the impact of open-source software (OSS), measured by number of downloads, through a study of Python and R packages. The OSS community is characterized by a high level of collaboration and sharing, which creates interactions among contributors and, through code reuse, among packages. We use data collected from Depsy.org on the development activities of Python and R packages to generate dependency and contributor networks. We develop three quasi-Poisson models for each of the Python and R communities using network characteristics as well as author and package attributes. We find that the more derivative a package is (the more dependencies it has), the less likely it is to have a high impact. We also show that the centrality of a package in the dependency network, measured by out-degree, closeness centrality, and PageRank, has a significant effect on its impact. Moreover, the closeness and weighted degree centralities of developers in the Python and R contributor networks play an important role. Finally, introducing network features to a baseline model that uses only package features (e.g., number of authors, number of commits) improves model performance.
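To make the network measures named in the abstract concrete, here is a minimal sketch that computes out-degree, in-degree, and PageRank on a toy dependency network using only the Python standard library. The package names, the edge direction (an edge from a to b meaning "a depends on b"), and the damping factor of 0.85 are assumptions made for illustration; the abstract does not state how the authors built or oriented their networks, and closeness centrality is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical toy data: an edge (a, b) means "package a depends on package b".
# This direction is an assumption for illustration, not taken from the paper.
DEPENDENCIES = [
    ("app", "webkit"),
    ("app", "numlib"),
    ("webkit", "numlib"),
    ("plots", "numlib"),
]


def degree_counts(edges):
    """Return (out_degree, in_degree) dictionaries for a directed edge list."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    nodes = {node for edge in edges for node in edge}
    return ({n: out_deg[n] for n in nodes}, {n: in_deg[n] for n in nodes})


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; rank flows along edge direction."""
    nodes = sorted({node for edge in edges for node in edge})
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not successors[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in successors[src]:
                new_rank[dst] += damping * rank[src] / len(successors[src])
        rank = new_rank
    return rank


out_degree, in_degree = degree_counts(DEPENDENCIES)
ranks = pagerank(DEPENDENCIES)
```

Under the assumed orientation, out-degree counts how many dependencies a package has (how "derivative" it is), while in-degree and PageRank reflect how heavily other packages rely on it. In this toy network the hypothetical `numlib` package, which every other package depends on directly or indirectly, receives the highest PageRank.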

Notes

  1. Other network characteristics were removed because of the high correlations with the other measures included in the models.
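The kind of screening described in this note can be sketched as a simple pairwise-correlation filter. The following is an illustration under stated assumptions, not the authors' procedure: the feature values are invented, the 0.9 threshold is arbitrary, and the greedy keep-in-order rule is just one of several reasonable ways to decide which of two correlated measures to drop.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Keep features in the given order, skipping any whose absolute
    correlation with an already-kept feature reaches the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-package measures (made-up numbers, one value per package).
FEATURES = {
    "out_degree": [0, 1, 2, 3, 5],
    "weighted_degree": [0, 2, 4, 6, 10],  # exactly twice out_degree
    "closeness": [0.9, 0.1, 0.5, 0.3, 0.4],
}

selected = drop_correlated(FEATURES)
```

Here the hypothetical `weighted_degree` column is a multiple of `out_degree`, so it is dropped, while `closeness` is only weakly correlated with `out_degree` and is kept. The order in which features are listed determines which member of a correlated pair survives, so in practice that order would be chosen on substantive grounds.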

Acknowledgements

This material is based on work supported by the US Department of Agriculture (58-3AEU-7-0074) and the National Science Foundation under IGERT Grant DGE-1144860, Big Data Social Science. We acknowledge the Data Science for the Public Good Program 2017 participants Daniel Chen, Sayali Phadke, Eirik Iversen, and Ben Swartz.

Author information

Corresponding author

Correspondence to Gizem Korkmaz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appeared in the Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (Korkmaz et al. 2018).

About this article

Cite this article

Korkmaz, G., Kelling, C., Robbins, C. et al. Modeling the impact of Python and R packages using dependency and contributor networks. Soc. Netw. Anal. Min. 10, 7 (2020). https://doi.org/10.1007/s13278-019-0619-1

  • DOI: https://doi.org/10.1007/s13278-019-0619-1
