DOI: 10.1145/3540250.3549106
research-article

Quantifying community evolution in developer social networks

Published: 09 November 2022

ABSTRACT

Understanding the evolution of communities in developer social networks (DSNs) around open source software (OSS) projects can provide valuable insights into the socio-technical process of OSS development. Existing studies show that the evolutionary behaviors of social communities can be described effectively using patterns including split, shrink, merge, expand, emerge, and extinct. However, existing pattern-based approaches offer limited support for quantitative analysis, and they are potentially problematic because they treat the patterns as mutually exclusive when describing community evolution. In this work, we propose that different patterns can occur simultaneously between every pair of communities during evolution, only to different degrees. Four entropy-based indices are devised to measure the degree of community split, shrink, merge, and expand, respectively, which together provide a comprehensive and quantitative measure of community evolution in DSNs. The indices have properties desirable for quantifying community evolution, including monotonicity and bounded maximum and minimum values that correspond to meaningful cases. They can also be combined to describe further patterns such as community emerge and extinct. We conduct studies with real-world OSS projects to evaluate the validity of the proposed indices. The results suggest that the proposed indices can effectively capture community evolution and are consistent with existing approaches in detecting evolution patterns in DSNs, with an accuracy of 94.1%. The results also show that the indices are useful in predicting OSS team productivity, with an accuracy of 0.718. In summary, the proposed approach is among the first to quantify the degree of community evolution with respect to different patterns, which is promising for supporting future research and applications concerning DSNs and OSS development.
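
The abstract does not give the definitions of the four indices, so the following is only a rough sketch of how an entropy-based split score could behave, not the paper's index. It assumes a hypothetical function `split_degree` that takes one community at time t and a mapping from developer to community label at time t+1, and it normalizes Shannon entropy by its maximum for the number of tracked members. Developers absent from the later snapshot are simply ignored, which is an assumption of this sketch.

```python
import math
from collections import Counter


def split_degree(community, next_membership):
    """Illustrative split score in [0, 1] for one community (not the paper's index).

    community: set of developer ids at time t.
    next_membership: dict mapping developer id -> community label at time t+1.
    Developers missing from next_membership are ignored (assumption of this sketch).
    """
    labels = [next_membership[d] for d in community if d in next_membership]
    n = len(labels)
    counts = Counter(labels)
    # A single successor community (or fewer than two tracked members) means no split.
    if n < 2 or len(counts) == 1:
        return 0.0
    # Shannon entropy (in bits) of how members are distributed over successors.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Divide by the maximum entropy log2(n), reached when every member separates.
    return entropy / math.log2(n)


devs = {"a", "b", "c", "d"}
together = {"a": "X", "b": "X", "c": "X", "d": "X"}
halves = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
scattered = {"a": "W", "b": "X", "c": "Y", "d": "Z"}

for name, nxt in [("together", together), ("halves", halves), ("scattered", scattered)]:
    print(name, split_degree(devs, nxt))
```

In this sketch the score is 0 when the community stays intact and 1 when every member ends up in a different community, which mirrors the bounded, meaningful extremes the abstract describes without claiming to reproduce the authors' formulas.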


Published in

ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2022
1822 pages
ISBN: 9781450394130
DOI: 10.1145/3540250

Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 112 of 543 submissions, 21%
