Abstract
As open-source software (OSS) development becomes mainstream, an increasing number of businesses and developers are joining OSS projects. For project managers, developers and users, understanding the current health status of a project is important for managing the development process, choosing which open-source projects to contribute to, and deciding whether to adopt the software packages those projects produce. An efficient approach to evaluating the health status of open-source projects is therefore needed. Although many approaches, including metrics, have been proposed, they are designed in arbitrary ways. In this paper, a mathematical tool, factor analysis, is used to build a health evaluation model for OSS projects. As far as we know, this is the first time that factor analysis has been applied to evaluate OSS projects. The model is based on GitHub data and takes as input the basic indexes that are closely related to project health. Six new synthetic metrics, namely community activity, project popularity, development activity, completeness, responsiveness and persistence, are then obtained through factor analysis and can be used to calculate the overall health score of a project. To verify the effectiveness of the model, it is applied to some real projects, and the results show that the overall scores it produces reflect the health status of the projects.
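The abstract does not state how the six synthetic metrics are combined into the overall score. As a minimal sketch, assuming the common factor-analysis convention of weighting each factor score by its share of the explained variance, the combination could look like the following. The function name `overall_health_score`, the snake_case factor keys and all numeric values are hypothetical and are not taken from the paper.

```python
# Minimal sketch: combine six factor scores into one overall health score.
# ASSUMPTION: weights are each factor's share of explained variance; the
# abstract does not specify the weighting the authors actually use.

# Factor names follow the abstract; the snake_case keys are our own.
FACTORS = [
    "community_activity",
    "project_popularity",
    "development_activity",
    "completeness",
    "responsiveness",
    "persistence",
]


def overall_health_score(factor_scores, variance_explained):
    """Return the variance-weighted sum of the six factor scores."""
    missing = [
        f for f in FACTORS
        if f not in factor_scores or f not in variance_explained
    ]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    total = sum(variance_explained[f] for f in FACTORS)
    if total <= 0:
        raise ValueError("total explained variance must be positive")
    # Normalise so the weights sum to 1 even if the inputs do not.
    return sum(
        factor_scores[f] * variance_explained[f] / total for f in FACTORS
    )


# Illustrative (made-up) numbers for a single project.
example_variance = dict(zip(FACTORS, [0.25, 0.20, 0.18, 0.15, 0.12, 0.10]))
example_scores = dict(zip(FACTORS, [0.8, 1.2, 0.5, -0.1, 0.3, 0.9]))
print(round(overall_health_score(example_scores, example_variance), 3))
```

Normalising the weights keeps scores comparable across projects when the retained factors explain only part of the total variance.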
Acknowledgment
This work is supported by the National Key Research and Development Plan (No. 2018YFB1003800).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, S., Cao, J., Prasad, M. (2019). The Metrics to Evaluate the Health Status of OSS Projects Based on Factor Analysis. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2019. Communications in Computer and Information Science, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-15-1377-0_56
DOI: https://doi.org/10.1007/978-981-15-1377-0_56
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1376-3
Online ISBN: 978-981-15-1377-0
eBook Packages: Computer Science, Computer Science (R0)