The Metrics to Evaluate the Health Status of OSS Projects Based on Factor Analysis

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1042)

Abstract

As open-source software (OSS) development becomes more widespread, an increasing number of businesses and developers are joining OSS projects. For project managers, developers and users, understanding the current health status of a project is important for managing the development process, selecting open-source projects to develop, and deciding whether to adopt the software packages those projects produce. An efficient approach to evaluating the health status of open-source projects is therefore needed. Although many approaches, including metrics, have been proposed, they are designed in arbitrary ways. In this paper, a mathematical tool, factor analysis, is used to build a health evaluation model for OSS projects. As far as we know, this is the first time that factor analysis has been applied to evaluate OSS projects. The model is based on GitHub data and takes as input the basic indexes that are closely related to the health status of the projects. Factor analysis then yields six new synthetic metrics, namely community activity, project popularity, development activity, completeness, responsiveness and persistence, which can be used to calculate the overall health score of a project. To verify the effectiveness of the model, it is applied to some real projects, and the results show that the overall scores it produces can reflect the health status of those projects.
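
The abstract does not state how the six synthetic metrics are combined into the overall health score. The sketch below illustrates one common convention for factor-analysis-based composites: a weighted average of factor scores, with each weight proportional to the share of variance that factor explains. The weighting rule, the numeric values in `VARIANCE_EXPLAINED`, the example factor scores and the function name `overall_health_score` are all assumptions made for illustration and are not taken from the paper. The factor extraction itself is not shown.

```python
# Hypothetical share of total variance explained by each of the six
# synthetic metrics named in the abstract. Illustrative values only;
# they are NOT results from the paper.
VARIANCE_EXPLAINED = {
    "community_activity": 0.30,
    "project_popularity": 0.20,
    "development_activity": 0.15,
    "completeness": 0.10,
    "responsiveness": 0.05,
    "persistence": 0.05,
}


def overall_health_score(factor_scores, variance_explained=VARIANCE_EXPLAINED):
    """Combine per-factor scores into one score.

    Assumed rule: weighted average of the factor scores, where each
    weight is the factor's explained variance divided by the total
    explained variance, so the weights sum to 1.
    """
    missing = set(variance_explained) - set(factor_scores)
    if missing:
        raise KeyError(f"missing factor scores: {sorted(missing)}")
    total = sum(variance_explained.values())
    return sum(
        (share / total) * factor_scores[name]
        for name, share in variance_explained.items()
    )


# Hypothetical standardized factor scores for one project.
example_project = {
    "community_activity": 1.2,
    "project_popularity": 0.8,
    "development_activity": 0.5,
    "completeness": -0.1,
    "responsiveness": 0.3,
    "persistence": 0.0,
}

if __name__ == "__main__":
    print(round(overall_health_score(example_project), 3))
```

Because the weights are normalized, a project whose six factor scores are all equal receives that same value as its overall score. Under this assumed rule, factors that explain more variance contribute more to the ranking.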

Notes

  1. https://octoverse.GitHub.com/.

Acknowledgment

This work is supported by the National Key Research and Development Plan (No. 2018YFB1003800).

Author information

Corresponding author

Correspondence to Jian Cao.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Jiang, S., Cao, J., Prasad, M. (2019). The Metrics to Evaluate the Health Status of OSS Projects Based on Factor Analysis. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_56

  • DOI: https://doi.org/10.1007/978-981-15-1377-0_56

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1376-3

  • Online ISBN: 978-981-15-1377-0

  • eBook Packages: Computer Science, Computer Science (R0)
