
How Do Successful and Failed Projects Differ? A Socio-Technical Analysis

Published: 12 July 2022

Abstract

Software development is at the intersection of the social realm, involving people who develop the software, and the technical realm, involving artifacts (code, docs, etc.) that are being produced. It has been shown that a socio-technical perspective provides rich information about the state of a software project.
In particular, we are interested in socio-technical factors that are associated with project success. For this purpose, we frame the task as a network classification problem. We show how a set of heterogeneous networks composed of social and technical entities can be jointly embedded in a single vector space, enabling mathematically sound comparisons between distinct software projects. Our approach is specifically designed around intuitive metrics stemming from network analysis and statistics to ease the interpretation of results in the context of software engineering wisdom. Based on a selection of 32 open source projects, we perform an empirical study to validate our approach, considering three prediction scenarios that test the classification model’s ability to generalize to (1) randomly held-out project snapshots, (2) future project states, and (3) entirely new projects.
Our results provide evidence that a socio-technical perspective is superior to a pure social or technical perspective when it comes to early indicators of future project success. To our surprise, the methodology proposed here even shows evidence of being able to generalize to entirely novel software projects (project hold-out set), reaching prediction accuracies of 80%, which is a further testament to the efficacy of our approach and beyond what has been possible so far. In addition, we identify key features that are strongly associated with project success. Our results indicate that even relatively simple socio-technical networks capture highly relevant and interpretable information about the early indicators of future project success.
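The pipeline the abstract describes (socio-technical networks → metric-based feature vectors → classification) can be sketched in miniature. The snippet below is a hedged illustration only: the commit representation as (developer, file) pairs, the metric names, and the toy data are assumptions for exposition, not the authors' actual feature set or implementation.

```python
# Illustrative sketch (not the paper's implementation): build a bipartite
# socio-technical network from (developer, file) commit pairs and summarize
# it with simple network metrics, yielding a feature vector that a
# classifier could consume. The metric choices here are assumptions.
from collections import defaultdict
from statistics import mean

def feature_vector(commits):
    """Summarize a developer-file bipartite network as named features."""
    dev_files = defaultdict(set)   # developer -> files touched
    file_devs = defaultdict(set)   # file -> developers touching it
    for dev, path in commits:
        dev_files[dev].add(path)
        file_devs[path].add(dev)
    return {
        "n_devs": len(dev_files),
        "n_files": len(file_devs),
        "mean_dev_degree": mean(len(fs) for fs in dev_files.values()),
        "max_file_contributors": max(len(ds) for ds in file_devs.values()),
    }

# Toy commit log: two developers coupled through a shared file (net.c).
commits = [("alice", "core.c"), ("alice", "net.c"),
           ("bob", "net.c"), ("bob", "docs.md")]
fv = feature_vector(commits)
```

Vectors of this shape, computed per project snapshot, are what would then be fed to a standard classifier to predict the success label; because every dimension is a named, interpretable network metric, the learned model can be inspected in terms of familiar software engineering concepts.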


Published In

ACM Transactions on Software Engineering and Methodology  Volume 31, Issue 4
October 2022
867 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3543992
Editor: Mauro Pezzè

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2022
Online AM: 08 February 2022
Accepted: 01 December 2021
Revised: 01 November 2021
Received: 01 April 2021
Published in TOSEM Volume 31, Issue 4

Author Tags

  1. Empirical software engineering
  2. statistical network analysis
  3. quantitative software engineering
  4. information system success
  5. socio-technical networks

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • DFG

Article Metrics

  • Downloads (Last 12 months): 192
  • Downloads (Last 6 weeks): 15
Reflects downloads up to 16 Feb 2025

Cited By

  • (2025) Metric cake shop: A serious game for supporting education on ISO/IEC/IEEE 15939:2017 – Systems and software engineering – Measurement process in the context of an undergraduate software engineering course. Computer Standards & Interfaces 91, 103879. DOI: 10.1016/j.csi.2024.103879. Online publication date: Jan 2025.
  • (2024) Measuring and Mining Community Evolution in Developer Social Networks with Entropy-Based Indices. ACM Transactions on Software Engineering and Methodology 34, 1, 1–43. DOI: 10.1145/3688832. Online publication date: 16 Aug 2024.
  • (2024) Sustainability Forecasting for Deep Learning Packages. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 981–992. DOI: 10.1109/SANER60148.2024.00106. Online publication date: 12 Mar 2024.
  • (2024) Sources of Underproduction in Open Source Software. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 740–751. DOI: 10.1109/SANER60148.2024.00081. Online publication date: 12 Mar 2024.
  • (2024) Smart recommender for the configuration of software project development teams. Expert Systems with Applications, 125141. DOI: 10.1016/j.eswa.2024.125141. Online publication date: Aug 2024.
  • (2024) Teaching Theorizing in Software Engineering Research. Handbook on Teaching Empirical Software Engineering, 31–69. DOI: 10.1007/978-3-031-71769-7_3. Online publication date: 25 Dec 2024.
  • (2024) Is There a Correlation Between Readme Content and Project Meta-Characteristics? Software: Practice and Experience 55, 3, 589–609. DOI: 10.1002/spe.3390. Online publication date: 18 Nov 2024.
  • (2024) On the sustainability of deep learning projects. Journal of Software: Evolution and Process 36, 7. DOI: 10.1002/smr.2645. Online publication date: 14 Jul 2024.
  • (2023) Hierarchical and Hybrid Organizational Structures in Open-source Software Projects: A Longitudinal Study. ACM Transactions on Software Engineering and Methodology 32, 4, 1–29. DOI: 10.1145/3569949. Online publication date: 26 May 2023.
