Quality evaluation meta-model for open-source software: multi-method validation study

  • Research
  • Published in: Software Quality Journal

Abstract

In recent years, open-source software (OSS) has attracted increasing attention due to its easy accessibility via cloud repositories, its voluntary community, freedom from vendor lock-in, and low total cost of ownership. In turn, specifying and evaluating OSS quality has become a significant challenge for organizations inclined to adopt OSS. Although many OSS quality models have been proposed in the literature, the dynamic and diverse nature of OSS has made these models heterogeneous in structure and content. This heterogeneity has hindered the standardization of evaluations and has made the results obtained from different OSS quality models for the same purpose incomparable and sometimes unreliable. Therefore, in this study, a meta-model for OSS quality (OSS-QMM) is proposed that unifies the structure of existing quality models and enables the derivation of homogeneous models. For this purpose, a systematic effort was spent via a step-based meta-model creation process including review-and-revise iterations. To validate the OSS-QMM, case study and expert opinion methods were applied to answer three research questions (RQs) targeting the practical applicability, results comparability, and effectiveness of using the meta-model. Multiple and embedded case study designs were employed to evaluate three real ERP systems, and 20 subject matter experts were interviewed during the validation process. The results of these multi-faceted empirical studies indicate that the OSS-QMM addresses the problems in OSS quality evaluation and adoption with a high degree of confidence.


Data availability

The data that support the findings of this study are openly available in Google Drive and Zenodo at the following URLs:

1. Definition of the terminologies in the SQMM, Zenodo. URL: https://doi.org/10.5281/zenodo.6367596

2. List of questions to obtain feedback from experts (Step 4.4), Google Drive. URL: https://tinyurl.com/2qwowtzh

3. Expert opinion (list of questions and expert answer sheet, Step 5), Google Drive. URL: https://tinyurl.com/2ow7ayua

4. Supplementary document for case studies 2 and 3, Zenodo. URL: https://doi.org/10.5281/zenodo.7986369

References

  • Adeoye-Olatunde, O. A., & Olenik, N. L. (2021). Research and scholarly methods: Semi-structured interviews. Journal of the American College of Clinical Pharmacy, 4(10), 1358–1367.

  • Adewumi, A., Misra, S., & Omoregbe, N. (2019). FOSSES: Framework for open-source software evaluation and selection. Software: Practice and Experience, 49(5), 780–812.

  • Adewumi, A., Misra, S., Omoregbe, N., Crawford, B., & Soto, R. (2016). A systematic literature review of open source software quality assessment models. SpringerPlus, 5(1), 1936.

  • Al-Dhaqm, A., Razak, S., Othman, S. H., Ngadi, A., Ahmed, M. N., & Ali Mohammed, A. (2017). Development and validation of a database forensic metamodel (DBFM). PloS One, 12(2), e0170793.

  • Alsolai, H., & Roper, M. (2020). A systematic literature review of machine learning techniques for software maintainability prediction. Information and Software Technology, 119, 106214.

  • Ardito, L., Coppola, R., Barbato, L., & Verga, D. (2020). A tool-based perspective on software code maintainability metrics: A systematic literature review. Scientific Programming, 2020.

  • Arthur, J. D., & Stevens, K. T. (1989). Assessing the adequacy of documentation through document quality indicators. In Proceedings. Conference on Software Maintenance, 40–49. IEEE.

  • Aversano, L., & Tortorella, M. (2013). Quality evaluation of floss projects: Application to ERP systems. Information and Software Technology, 55(7), 1260–1276.

  • Aversano, L., Guardabascio, D., & Tortorella, M. (2017). Analysis of the documentation of ERP software projects. Procedia Computer Science, 121, 423–430.

  • Bakar, A. D., Sultan, A. B. M., Zulzalil, H., & Din, J. (2012). Review on 'maintainability' metrics in open source software. International Review on Computers and Software, 7(3), 903–907.

  • Bayer, J., & Muthig, D. (2006). A view-based approach for improving software documentation practices. 13th Annual IEEE International Symposium and Workshop on Engineering of Computer-Based Systems (ECBS’06) (p. 10). IEEE.

  • Beydoun, G., Low, G., Henderson-Sellers, B., Mouratidis, H., Gomez-Sanz, J. J., Pavon, J., & Gonzalez-Perez, C. (2009). FAML: A generic metamodel for MAS development. IEEE Transactions on Software Engineering, 35(6), 841–863.

  • Boehm, B. W., Brown, H., & Lipow, M. (1978). Quantitative evaluation of software quality. In Proceedings of the 2nd International Conference on Software Engineering, 592–605.

  • Briand, L., Morasca, S., & Basili, V. (2002). An operational process for goal driven definition of measures. IEEE Transactions on Software Engineering, 28(12), 1106–1125.

  • Brings, J., Daun, M., Keller, K., Obe, P. A., & Weyer, T. (2020). A systematic map on verification and validation of emergent behavior in software engineering research. Future Generation Computer Systems, 112, 1010–1037.

  • Butler, S., Gamalielsson, J., Lundell, B., Brax, C., Mattsson, A., Gustavsson, T., & Lönroth, E. (2022). Considerations and challenges for the adoption of open source components in software-intensive businesses. Journal of Systems and Software, 186, 111152.

  • Chakraborty, S. (2022). TOPSIS and modified TOPSIS: A comparative analysis. Decision Analytics Journal, 2, 100021.

  • Chawla, M. K., & Chhabra, I. (2015, October). SQMMA: Software quality model for maintainability analysis. In Proceedings of the 8th Annual ACM India Conference, 9–17.

  • Chidamber, S. R., & Kemerer, C. F. (1994). A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493.

  • Codemetrics. (2019). URL: https://plugins.jetbrains.com/plugin/12159-codemetrics

  • Dagpinar, M., & Jahnke, J. H. (2003, November). Predicting maintainability with object-oriented metrics-an empirical comparison. In 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings, 155–155. IEEE Computer Society.

  • Dromey, R. G. (1995). A model for software product quality. IEEE Transactions on Software Engineering, 21(2), 146–162.

  • Dubey, S. K., & Rana, A. (2011). Assessment of maintainability metrics for object-oriented software system. ACM SIGSOFT Software Engineering Notes, 36(5), 1–7.

  • Duijnhouwer, F. W., & Widdows, C. (2003). Capgemini expert letter open source maturity model. Retrieved: 30 April 2022. Capgemini. URL: tinyurl.com/yxdbvjk6

  • Dweiri, F., Kumar, S., Khan, S. A., & Jain, V. (2016). Designing an integrated AHP based decision support system for supplier selection in automotive industry. Expert Systems with Applications, 62, 273–283.

  • Eghan, E. E., Alqahtani, S. S., Forbes, C., & Rilling, J. (2019). API trustworthiness: An ontological approach for software library adoption. Software Quality Journal, 27(3), 969–1014.

  • Frantz, R. Z., Rehbein, M. H., Berlezi, R., & Roos-Frantz, F. (2019). Ranking open source application integration frameworks based on maintainability metrics: A review of five-year evolution. Software: Practice and Experience, 49(10), 1531–1549.

  • Garcia, F., Bertoa, M. F., Calero, C., Vallecillo, A., Ruiz, F., Piattini, M., & Genero, M. (2006). Towards a consistent terminology for software measurement. Information and Software Technology, 48(8), 631–644.

  • Gezici, B., Özdemir, N., Yılmaz, N., Coşkun, E., Tarhan, A., & Chouseinoglou, O. (2019). Quality and success in open source software: A systematic mapping. 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 363–370. IEEE.

  • Goeb, A. (2013). A meta model for software architecture conformance and quality assessment. Electronic Communications of the EASST, 60.

  • Grady, R. B. (1992). Practical software metrics for project management and process improvement. Prentice Hall.

  • Hanefi Calp, M., & Arıcı, N. (2011). Nesne yönelimli tasarım metrikleri ve kalite özellikleriyle ilişkisi [Object-oriented design metrics and their relation to quality characteristics]. Politeknik Dergisi (Journal of Polytechnic), 14(1), 9–14.

  • Hanine, M., Boutkhoum, O., Tikniouine, A., & Agouti, T. (2016). Application of an integrated multi-criteria decision making AHP-TOPSIS methodology for ETL software selection. Springerplus, 5(1), 1–17.

  • Hasnain, S., Ali, M. K., Akhter, J., Ahmed, B., & Abbas, N. (2020). Selection of an industrial boiler for a soda-ash production plant using analytical hierarchy process and TOPSIS approaches. Case Studies in Thermal Engineering, 19, 100636.

  • Hauge, O., Osterlie, T., & Sorensen, C. F. (2009) An empirical study on selection of open source software-preliminary results. In: ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development. IEEE.

  • Hmood, A., Keivanloo, I., & Rilling, J. (2012, July). SE-EQUAM: An evolvable quality metamodel. In 2012 IEEE 36th Annual Computer Software and Applications Conference Workshops, 334–339. IEEE.

  • Ho, W., & Ma, X. (2018). The state-of-the-art integrations and applications of the analytic hierarchy process. European Journal of Operational Research, 267(2), 399–414.

  • IEEE standard glossary of software engineering terminology. (1990). IEEE Std 610.12-1990, 1–84.

  • IEEE standard for a software quality metrics methodology. (1998). IEEE Std 1061-1998.

  • Işıklar, G., & Büyüközkan, G. (2007). Using a multi-criteria decision making approach to evaluate mobile phone alternatives. Computer Standards & Interfaces, 29(2), 265–274.

  • ISO, International Standard ISO VIM. (1993). International vocabulary of basic and general terms in metrology (2nd ed.). International Organization for Standardization, Geneva, Switzerland.

  • ISO/IEC 14598-3:1999. (1999). Information technology-software product evaluation-Part 3: Process for developers. International Organization for Standardization, Geneva.

  • ISO/IEC 15939:2007. (2007). Information Technology—Software Engineering—Software Measurement Process. International Organization for Standardization, Geneva.

  • ISO/IEC 9126-1:2001. (2001). Software engineering - Product quality -Part 1: Quality model, international organization for standardization, Geneva, Switzerland.

  • Jha, S., Kumar, R., Abdel-Basset, M., Priyadarshini, I., Sharma, R., & Long, H. V. (2019). Deep learning approach for software maintainability metrics prediction. IEEE Access, 7, 61840–61855.

  • Jiang, S., Cao, J., & Qi, Q. (2021). Exploring development-related factors affecting the popularity of open source software projects. In 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 244–249. IEEE.

  • Joshi, A., Kale, S., Chandel, S., & Pal, D. K. (2015). Likert scale: Explored and explained. British Journal of Applied Science & Technology, 7(4), 396.

  • Khashei-Siuki, A., & Sharifan, H. (2020). Comparison of AHP and FAHP methods in determining suitable areas for drinking water harvesting in Birjand aquifer, Iran. Groundwater for Sustainable Development, 10, 100328.

  • Khatri, S. K., & Singh, I. (2016). Evaluation of open source software and improving its quality. 5th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO). IEEE.

  • Kim, H. M. (1999). Representing and reasoning about quality using enterprise models. PhD thesis, Dept. Mechanical and Industrial Engineering, University of Toronto, Canada.

  • Kitchenham, B., Hughes, R. T., & Linkman, S. G. (2001). Modeling software measurement data. IEEE Transactions on Software Engineering, 27(9), 788–804.

  • Kläs, M., Lampasona, C., Nunnenmacher, S., Wagner, S., Herrmannsdörfer, M., & Lochmann, K. (2010). How to evaluate meta-models for software quality. In Proceedings of the 20th International Workshop on Software Measurement.

  • Lenarduzzi, V., Taibi, D., Tosi, D., Lavazza, L., & Morasca, S. (2020). Open source software evaluation, selection, and adoption: A systematic literature review. In: 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE

  • Li, J., Conradi, R., Bunse, C., Torchiano, M., Slyngstad, O. P. N., & Morisio, M. (2009). Development with off-the-shelf components: 10 facts. IEEE Software, 26(2), 80–87.

  • List of questions and expert opinion (Step-5). (2023). Retrieved: 20 June 2023. URL: https://tinyurl.com/2ow7ayua

  • Magaldi, D., & Berler, M. (2020). Semi-structured interviews. Encyclopedia of Personality and Individual Differences, 4825–4830.

  • McCall, J. A., Richards, P. K., & Walters, G. F. (1977). Factors in software quality, volumes I, II, and III. US Rome Air Development Center Reports, US Department of Commerce, USA.

  • Mens, T., Doctors, L., Habra, N., Vanderose, B., & Kamseu, F. (2011). Qualgen: Modeling and analysing the quality of evolving software systems. In: 15th European Conference on Software Maintenance and Reengineering. IEEE.

  • MetricsReloaded. (2004). URL: https://plugins.jetbrains.com/plugin/93-metricsreloaded

  • Nistala, P., Nori, K. V., & Reddy, R. (2019). Software quality models: A systematic mapping study. International Conference on Software and System Processes (ICSSP), 125–134. IEEE.

  • Object Management Group (OMG). (2019). Meta Object Facility (MOF). Core specification version 2.5.1. Retrieved: 2 October 2022. URL: https://www.omg.org/spec/MOF/2.5.1/PDF

  • Othman, S. H., & Beydoun, G. (2010). Metamodelling approach to support disaster management knowledge sharing. In: 21st Australasian Conference on Information Systems.

  • Othman, S. H., Beydoun, G., & Sugumaran, V. (2014). Development and validation of a disaster management metamodel (DMM). Information Processing & Management, 50(2), 235–271.

  • Saaty, T. L. (1980). The analytic hierarchy process: Planning, priority setting and resource allocation. McGraw-Hill.

  • Saaty, T. L. (2008). Decision making with the analytic hierarchy process. International Journal of Services Sciences, 1(1), 83–98.

  • Saaty, T. L., & Sagir, M. (2015). Ranking countries more reliably in the summer olympics. International Journal of the Analytic Hierarchy Process, 7(3), 589–610.

  • Salem, I. E. B. (2015). Transformational leadership: Relationship to job stress and job burnout in five-star hotels. Tourism and Hospitality Research, 15(4), 240–253.

  • Samoladas, I., Gousios, G., & Spinellis, D. (2008). The SQO-OSS quality model: Measurement-based open source software evaluation. In: IFIP International Conference on Open Source Systems. Springer, Boston, MA.

  • SciTools Understand. (2020). URL: https://scitools.com/

  • Semeteys, R. (2006). Method for qualification and selection of open source software (QSOS), version 1.6. Retrieved: 30 April 2022. URL: tinyurl.com/y2phllex

  • Silva, D. G., Coutinho, C., & Costa, C. J. (2023). Factors influencing free and open-source software adoption in developing countries—an empirical study. Journal of Open Innovation: Technology, Market, and Complexity, 9(1), 21–33.

  • Sjoberg, G., Orum, A. M., & Feagin, J. R. (2020). A case for the case study. The University of North Carolina Press.

  • Soto, M., & Ciolkowski, M. (2009). The QualOSS open source assessment model measuring the performance of open source communities. In: 3rd International Symposium on Empirical Software Engineering and Measurement. IEEE.

  • Spinellis, D., & Jureczko, M. (2011, May). Metric description [Online]. Available: http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/

  • Swedberg, R. (2020). Exploratory research. The Production of Knowledge: Enhancing Progress in Social Science, 17–41.

  • Tanrıöver, Ö. Ö., & Bilgen, S. (2011). A framework for reviewing domain specific conceptual models. Computer Standards & Interfaces, 33(5), 448–464.

  • Tassone, J., Xu, S., Wang, C., Chen, J., & Du, W. (2018) Quality assessment of open source software: A review. IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), 411–416. IEEE.

  • Vanderose, B., Habra, N., & Kamseu, F. (2010). Towards a model-centric quality assessment. In Proceedings of the 20th International Workshop on Software Measurement (IWSM 2010): Conference on Software Process and Product Measurement (Stuttgart Nov 2010).

  • Visconti, M., & Cook, C. R. (2002) An overview of industrial software documentation practice. In 12th International Conference of the Chilean Computer Science Society, 2002. Proceedings, 179–186. IEEE.

  • Wagner, S., Goeb, A., Heinemann, L., Kläs, M., Lampasona, C., Lochmann, K., Mayr, A., Plösch, R., Seidl, A., Streit, J., & Trendowicz, A. (2015). Operationalised product quality models and assessment: The Quamoco approach. Information and Software Technology, 62, 101–123.

  • Wasserman, A. I., Pal, M., & Chan, C. (2006). Business readiness rating project, BRR whitepaper RFC 1. URL: tinyurl.com/y5srd5sq

  • Wasserman, A. I., Guo, X., McMillian, B., Qian, K., Wei, M. Y., & Xu, Q. (2017). OSSpal: Finding and evaluating open source software. In Open Source Systems: Towards Robust Practices: 13th IFIP WG 2.13 International Conference.

  • Wohlin, C. (2021). Case study research in software engineering—it is a case, and it is a study, but is it a case study? Information and Software Technology, 133, 106514.

  • Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., & Wesslén, A. (2012). Experimentation in software engineering: An introduction. Springer.

  • Yalcin, A. S., Kilic, H. S., & Delen, D. (2022). The use of multi-criteria decision-making methods in business analytics: A comprehensive literature review. Technological Forecasting and Social Change, 174, 121193.

  • Yin, R. K. (2018). Case study research and applications: Design and methods (6th ed.). Sage.

  • Yılmaz, N., & Tarhan, A. K. (2020). Meta-models for software quality and its evaluation: A systematic literature review. In: International Workshop on Software Measurement and the 15th International Conference on Software Process and Product Measurement, Mexico.

  • Yılmaz, N., & Tarhan, A. K. (2022a). Quality evaluation models or frameworks for open source software: A systematic literature review. Journal of Software: Evolution and Process, 34(6), e2458. https://doi.org/10.1002/smr.2458

  • Yılmaz, N., & Tarhan, A. K. (2022b). Matching terms of quality models and meta-models: Toward a unified meta-model of OSS quality. Software Quality Journal, 1–53. https://doi.org/10.1007/s11219-022-09603-3

  • Yilmaz, N., & Tarhan, A. K. (2022c). Definition of the term used in the SQMM. Zenodo. https://doi.org/10.5281/zenodo.6367596

  • Yilmaz, N., & Tarhan, A. K. (2023). Supplementary document of the article titled 'Quality evaluation meta-model for open-source software'. Zenodo. https://doi.org/10.5281/zenodo.7986369

  • Zhao, Y., Liang, R., Chen, X., & Zou, J. (2021). Evaluation indicators for open-source software: A review. Cybersecurity, 4(1), 1–24.

Download references

Funding

No funding was obtained for this study.

Author information

Authors and Affiliations

Authors

Contributions

NY and AKT conceived the presented idea and designed the empirical studies of validation. NY carried out the empirical studies, discussed the results with AKT, and wrote the manuscript. AKT reviewed the manuscript in several iterations and suggested revisions as necessary.

Corresponding authors

Correspondence to Nebi Yılmaz or Ayça Kolukısa Tarhan.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Final version of OSS-QMM (please refer to Yılmaz & Tarhan, 2022b for detailed explanations of the concepts and relationships)

[Figure: final version of the OSS-QMM]

Appendix 2. New operationalized quality model derived from OSS-QMM

[Figure: new operationalized quality model derived from the OSS-QMM]

Appendix 3. Mapping of the terms in the existing OSS quality models (OSMM, OpenBRR, OSSpal, and SQO-OSS) to the concepts of the OSS-QMM

| OSS-QMM concept | OSMM | OpenBRR | OSSpal | SQO-OSS (code-based) | SQO-OSS (community-based) |
|---|---|---|---|---|---|
| Viewpoint | Developer | Developer | Developer | Developer | Developer |
| OSS aspect | Community-based | Community-based | Community-based | Code-based | Community-based |
| Information need | Calculation of developer size to evaluate maintainability | Calculation of developer productivity to evaluate maintainability | Calculation of consulting services quality to evaluate maintainability | Calculation of comment frequency to evaluate maintainability | Calculation of documentation quality to evaluate maintainability |
| Characteristic | Maintainability | Maintainability | Maintainability | Maintainability | Maintainability |
| Sub-characteristic | Acceptance | Product quality | Support and service | Analyzability | Analyzability |
| Entity | Developer | Contributor | Contributor | Source code | Contributor |
| Quality requirement | A large developer community is desirable for maintainability | Productive developers are desirable for maintainability | An active consulting service is desirable for maintainability | A high comment frequency is desirable for maintainability | A large number of documents is desirable for maintainability |
| Impact | Positive | Positive | Positive | Positive | Positive |
| Measurable concept | The size of the developer community | Productivity of contributors | The activeness of the consulting community | Complexity of source code | Completeness of documentation |
| Measure | Number of developers (base measure) | Number of releases (base measure) | Number of consulting communities (base measure) | Weighted methods per class (WMC) (base measure) | Number of documents (base measure) |
| Unit | Developer | Release | Consulting community | Methods | Documents |
| Scale | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSMM) | Integer from zero to three (the score (1–3) is assigned w.r.t. rules given in OpenBRR) | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSSpal) | Integer from zero to infinity | Integer from zero to infinity |
| Measurement method | Manual | Manual | Manual | Automated (e.g., SciTools Understand, CKJM, and IntelliJ IDEA) | Manual |
| Measurement function | None (base measure) | None (base measure) | None (base measure) | None (base measure) | None (base measure) |
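To make the instantiation concrete, the sketch below models a slice of these OSS-QMM concepts as Python dataclasses and encodes the OSMM column of the table above. This is a minimal illustration only; the class and field names are ours, not artifacts of the published meta-model.

```python
from dataclasses import dataclass

@dataclass
class Measure:
    name: str
    unit: str
    scale: str
    method: str           # "manual" or "automatic"
    is_base: bool = True  # base measures need no measurement function

@dataclass
class QualityMapping:
    """One column of the Appendix 3 mapping, tied to OSS-QMM concepts."""
    viewpoint: str
    oss_aspect: str
    information_need: str
    characteristic: str
    sub_characteristic: str
    entity: str
    impact: str
    measurable_concept: str
    measure: Measure

# The OSMM column of the table, expressed as one instance.
osmm_row = QualityMapping(
    viewpoint="Developer",
    oss_aspect="Community-based",
    information_need="Calculation of developer size to evaluate maintainability",
    characteristic="Maintainability",
    sub_characteristic="Acceptance",
    entity="Developer",
    impact="Positive",
    measurable_concept="The size of the developer community",
    measure=Measure(name="Number of developers", unit="Developer",
                    scale="integer 0-5 (OSMM scoring rules)", method="manual"),
)

print(osmm_row.measure.name)  # -> Number of developers
```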

Appendix 4. Design details about case studies: list of selected OSS products, code-based measures, community-based measures, and steps of integrated AHP-TOPSIS method

(a) List of selected OSS products used in the case studies

| Product property | Apache OFBiz | Adempiere | Compiere |
|---|---|---|---|
| Website | https://ofbiz.apache.org/ | https://adempiere.org/ | http://www.compiere.com/ |
| Product type | Open-source ERP system | Open-source ERP system | Open-source ERP system |
| Programming language | Java (1,547,623 LOC) | Java (1,973,229 LOC) | Java (1,402,191 LOC) |
| First release date | 2009 | 2006 | 2000 |

(b) List of code-based measures with their description and measurable concepts associated with each measure (Bakar et al., 2012; Chawla & Chhabra, 2015; Dagpinar & Jahnke, 2003; Dubey & Rana, 2011)

| Measurable concept (MC) | Measure | Description |
|---|---|---|
| MC1: complexity of source code | WMC: weighted methods per class | The degree of complexity and the number of methods in a class (Hanefi Calp et al., 2011). As the number of methods increases, the time needed to analyze the code increases accordingly (Chidamber & Kemerer, 1994) |
|  | CC: cyclomatic complexity | Measures the number of linearly independent paths through the program source code and is directly related to the complexity of the code. A high value of this metric is undesirable and degrades source code analyzability |
|  | NNL: number of nested levels | Measures the depth of nesting of the loops in a class; a higher value of this metric reduces testability and stability |
| MC2: comment frequency of source code | NOS: number of statements | Used to measure the frequency of comments and explanations, which reduce the complexity of the software and facilitate tracking and understanding the program |
| MC3: inheritance complexity degree of source code | DIT: depth of inheritance tree | Measures the distance of a class from the root of the inheritance tree (Chidamber & Kemerer, 1994). A deep tree increases complexity, since it involves more classes and methods, indicating low changeability and stability of the software product |
|  | NOC: number of children | Measures the number of subclasses derived from a class. A high value indicates greater reuse, but also that more errors may occur and higher testing effort is required (Chidamber & Kemerer, 1994) |
| MC4: interaction complexity (coupling) degree of source code | CBO: coupling between object classes | Represents the number of classes coupled to a given class, i.e., classes whose properties or methods are used without inheritance between the classes (Chidamber & Kemerer, 1994). High dependence between classes harms the modular design and reduces changeability |
|  | RFC: response for a class | Measures the number of methods that can be triggered when the methods of an object of the class are called, i.e., the methods written in the class plus the methods they call (Chidamber & Kemerer, 1994). Software products with a lower RFC value can be better understood and tested |
| MC5: cohesion degree of source code | LCOM: lack of cohesion of methods | Measures the degree of dissimilarity among the methods of a class (Chidamber & Kemerer, 1994); low values of this metric are therefore desirable |
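In the case studies, these measures were collected from Java code with tools such as SciTools Understand, CKJM, and IntelliJ IDEA plugins. As a rough, self-contained illustration of what such tooling computes, the sketch below uses Python's ast module to derive a crude WMC (every method weighted as 1, so WMC equals the method count) and an NNL-like nesting depth; this is our own simplified analogue, not the tooling used in the study.

```python
import ast

SRC = """
class Order:
    def total(self, items):
        s = 0
        for it in items:          # nesting level 1
            if it.qty > 0:        # nesting level 2
                s += it.qty * it.price
        return s

    def is_empty(self, items):
        return len(items) == 0
"""

def nesting_depth(node, depth=0):
    """NNL-like measure: deepest chain of nested control-flow statements."""
    nested = (ast.For, ast.While, ast.If, ast.With, ast.Try)
    best = depth
    for child in ast.iter_child_nodes(node):
        d = depth + (1 if isinstance(child, nested) else 0)
        best = max(best, nesting_depth(child, d))
    return best

tree = ast.parse(SRC)
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        # Crude WMC: weight every method as 1, so WMC == number of methods.
        methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
        print(node.name, "WMC =", len(methods), "NNL =", nesting_depth(node))
```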

(c) List of the community-based measures with their equation and measurable concepts associated with each measure

| Measurable concept (MC) | Measure | Measurement function (equation) |
|---|---|---|
| MC6: difficulty degree of bugs | *BSI: bug severity index | \(\mathrm{BSI}=\frac{\#\,\text{blocker}}{\text{LOC}}\times 9+\frac{\#\,\text{critical}}{\text{LOC}}\times 7+\frac{\#\,\text{major}}{\text{LOC}}\times 5+\frac{\#\,\text{minor}}{\text{LOC}}\times 3+\frac{\#\,\text{trivial}}{\text{LOC}}\times 1\) |
| MC7: completeness of documentation | ND: number of documents | No equation (base measure) |
| MC8: the activeness of the community | *CD: commit density | \(\mathrm{CD}=\frac{\#\,\text{commits}/\#\,\text{developers}}{\text{kLOC}}\) |
|  | *ED: email density | \(\mathrm{ED}=\frac{\#\,\text{emails}/\#\,\text{developers}}{\text{LOC}/\#\,\text{releases}}\) |
| MC9: size of the development community | NC: number of contributors | No equation (base measure) |
| MC10: performance of contributors | *FRIS: feature request implementation success | \(\mathrm{FRIS}=\frac{\#\,\text{closed feature requests}/\#\,\text{total feature requests}}{\text{kLOC}}\) |
|  | *BSSR: bug-solving success rate | \(\mathrm{BSSR}=\frac{\#\,\text{closed bugs}/\#\,\text{total bugs}}{\text{kLOC}}\) |
| MC11: productivity of contributors | NR: number of releases | No equation (base measure) |
| MC12: fault proneness of contributors | *DD: defect density | \(\mathrm{DD}=\frac{\#\,\text{total defects}}{\text{LOC}}\) |
| MC13: maturity of the project | PA: product age | No equation (base measure) |
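The measurement functions above translate directly into code. The Python helpers below (our own naming; in the case studies the argument values came from the products' repositories and issue trackers) implement the starred derived measures exactly as given in the equations.

```python
def bsi(blocker, critical, major, minor, trivial, loc):
    """Bug severity index: severity-weighted bug counts per LOC."""
    weights = (9, 7, 5, 3, 1)
    counts = (blocker, critical, major, minor, trivial)
    return sum(c / loc * w for c, w in zip(counts, weights))

def commit_density(commits, developers, kloc):
    """CD: commits per developer, normalized by product size in kLOC."""
    return (commits / developers) / kloc

def email_density(emails, developers, loc, releases):
    """ED: emails per developer over average LOC per release."""
    return (emails / developers) / (loc / releases)

def fris(closed_frs, total_frs, kloc):
    """Feature request implementation success, normalized by kLOC."""
    return (closed_frs / total_frs) / kloc

def bssr(closed_bugs, total_bugs, kloc):
    """Bug-solving success rate, normalized by kLOC."""
    return (closed_bugs / total_bugs) / kloc

def defect_density(total_defects, loc):
    """DD: defects per line of code."""
    return total_defects / loc

# Toy example: a 1.5 MLOC product with a handful of open issues.
print(round(bsi(2, 5, 20, 40, 10, loc=1_500_000) * 1e6, 1))  # per-MLOC view
```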

(d) Integrated AHP-TOPSIS method used for quality evaluation in the case studies

[Figure e: steps of the integrated AHP-TOPSIS method used in the case studies]

Appendix 5. List of community-based measures and their descriptions


Bug severity index (BSI)

Bug severity is a classification of software defects (bugs) indicating the degree of negative impact on the quality of the software. It generally comprises five levels, from most severe to least severe: blocker, critical, major, minor, and trivial. The bug reporting databases of the OSS products are investigated to determine the severity of bugs. Since each level of bug affects the product differently, bugs are weighted according to their severity as follows: blocker = 9, critical = 7, major = 5, minor = 3, and trivial = 1. Accordingly, the number of bugs at each severity level, lines of code (LOC), and the severity weights are used as base measures to calculate BSI via the equation in Appendix 4(c): the number of bugs at each severity level is divided by the size of the product, each result is multiplied by the weight of its level, and the weighted results are summed.

Number of documents (ND)

Software documentation plays a very important role in all phases of a software system's life cycle. In the context of ERP, documentation is even more relevant due to the complexity of such software systems and the strategic role they play within operating organizations. It matters not only from the viewpoint of software engineering but also from the viewpoint of the user, who needs to know how to install and use an OSS system, import and manage data, and so on. In this context, various documents (e.g., a user guide, a technical guide, a database installation guide, a developer guide, API documentation, and Wiki pages) can be available for OSS systems in different formats (e.g., PDF and HTML). To access these documents, the websites of the three open-source ERP systems and their cloud repositories (e.g., SourceForge and GitHub) were visited, and the number of accessible documents was collected. Since ND is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Commit density (CD)

A commit is an operation that sends the latest changes to the source code (e.g., added or removed lines) to the repository. Every change to the source code of an OSS system has a purpose, e.g., to adapt, correct, perfect, or extend the system. Once performed, commits are stored in a version control system, such as the concurrent versions system (CVS) or Git. A high number of commits may therefore indicate active developers and a strong potential of the product to evolve through changes. The number of commits, the number of developers, and kilo lines of code (kLOC) are used as base measures to calculate commit density via the equation in Appendix 4(c): since a high number of developers and a large product size (i.e., kLOC) are likely to inflate the raw commit count, the count is normalized by both.

Email density (ED)

The mailing lists of an OSS product are archived monthly (e.g., in the CVS archive), and anyone with an interest in development can join them. In the case study, we considered the mailing list among developers. It contains different sorts of messages, including technical discussions, proposed changes, automatic notification messages about changes in the code, and problem reports. The number of emails, the number of developers, lines of code (LOC), and the number of versions (releases) are used as base measures to calculate email density via the equation in Appendix 4(c). As each version introduces new features, proposed changes and technical discussions among developers increase, which triggers an increase in email traffic. The number of developers matters because only emails among developers are counted. The size of the product (i.e., LOC) is also highly likely to be directly proportional to the number of emails: as the size of the product increases, so do product-related problems and, with them, the number of emails.

Number of contributors (NC)

The saying "many hands make light work" certainly holds true for an open-source product. If a contributor's ambitions and interests change, that person often moves on to other things. Also, an increase in the number of contributors to an OSS product makes the community more heterogeneous, and this heterogeneity contributes to the quality of OSS products. For example, if the contributors are employees of a single small company, there is the risk of that company cutting its support. As a result, the larger the group of contributors, the smaller the chance that development of the OSS product stalls. Since NC is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Feature request implementation success (FRIS)

Feedback from OSS users or developers constitutes a vital part of the evolution of OSS projects. Issue tracking systems (ITS) such as Bugzilla serve to request new features or enhancements to the OSS: users or developers express their demands for further development of the OSS product as feature requests in the ITS. Developers are then expected to implement these incoming feature requests to evolve the OSS product, so success in implementing them is important for the quality of OSS. This measure directly reflects the performance of the developers with respect to implementing feature requests. The number of closed (implemented) feature requests, the number of total feature requests, and kilo lines of code (kLOC) are used as base measures to calculate FRIS via the equation in Appendix 4(c).

Bug-solving success rate (BSSR)

A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result or to behave in unintended ways. Issue tracking systems (e.g., Bugzilla, Trac, or OTRS (open-source ticket requesting system)) are used to track the bug reports of OSS products. Teams that are more capable or disciplined in handling incoming bugs are generally considered more successful; that is, this measure directly reflects the performance of the developers with respect to bug solving. The number of closed (solved) bugs, the number of total bugs, and kilo lines of code (kLOC) are used as base measures to calculate BSSR via the equation in Appendix 4(c).

Number of releases (NR)

A software release is a change or set of changes packaged for delivery to the end-user, adding new features or enhancements. As expectations of software constantly change, many versions of the software may be released over its lifetime. The addition of new features with each version and the enhancement of the software can be related to the productivity of the developers. Although releases can be classified as minor, major, or emergency releases, the total number of releases is considered in this case study. Since NR is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Defect density (DD)

Defect density is the number of defects detected in a software component during a defined period of development or operation, divided by the size of the software component. Minimizing defect density is important for establishing the balance of time, cost, and quality that is critical to OSS projects. The number of total defects and lines of code (LOC) are used as base measures to calculate DD via the equation in Appendix 4(c).

Product age (PA)

The age of a product is the time from the product's creation to the present. The longer a product remains under active development, the smaller the chance that its development will suddenly stop. In this sense, the first year is the biggest challenge and hurdle for open-source initiatives: an initiative is often halted because a small community cannot sustain the workload that the product generates. Since developers typically receive no financial compensation, the product must attract new developers in order to live on for many years. In this context, the longevity of the product is important for product quality. Since PA is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Appendix 6. Description of evaluation methods with their formulae used in the case studies

AHP

The AHP method consists of the following steps; see Ho and Ma (2018) and Saaty (1980, 2008) for details.

Step 1. Structural hierarchies are created; the OSS aspect and quality characteristic concepts provide this structure. (No equation.)

Step 2. A pairwise comparison matrix \(A=[x_{ij}]_{n\times n}\) is constructed to compare the criteria in pairs. Each OSS aspect and its related sub-characteristics serve as the "criteria."

Step 3. Pairwise comparisons are performed by judging the relative importance of two selected criteria at a time. Matrix A is filled in using Saaty's 1–9 scale (Saaty, 1980; see Ho & Ma, 2018 for details).

Step 4. Matrix A is normalized to obtain the normalized pairwise comparison matrix \(A_{\text{norm}}\); each element of A is divided by the sum of the elements in its column:

\(A_{\text{norm}}=[a_{ij}]_{n\times n},\quad a_{ij}=\frac{x_{ij}}{\sum_{i=1}^{n}x_{ij}}\)  (1)

Step 5. The final weight of each criterion is calculated:

\(w_{i}=\frac{\sum_{j=1}^{n}a_{ij}}{n},\quad \sum_{i=1}^{n}w_{i}=1,\quad i,j=1,2,\dots,n\)  (2)

Step 6. The consistency ratio (CR) is calculated to check the consistency of the decision-maker's judgments. First the consistency index (CI) is computed, where \(\lambda_{\max}\) is the principal eigenvalue corresponding to the matrix of pairwise comparisons and n is the number of criteria being compared; then CR is obtained, where the random index (RI) is a value that depends on n (see Saaty, 2008; Ho & Ma, 2018 for the RI values):

\(CI=(\lambda_{\max}-n)/(n-1)\)  (3)

\(CR=CI/RI\)  (4)

Step 7. The final weight of each criterion is approved. (No equation.) A runnable sketch of these steps is given below.
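A compact sketch of AHP Steps 2–6 using NumPy follows; the 3×3 pairwise comparison matrix and its judgments are invented purely for illustration.

```python
import numpy as np

# Random index (RI) values for n = 3..10 (Saaty, 1980).
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(A):
    """Eqs. (1)-(4): column-normalize, average rows, then check consistency."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    A_norm = A / A.sum(axis=0)           # Eq. (1)
    w = A_norm.mean(axis=1)              # Eq. (2)
    lam_max = float((A @ w / w).mean())  # estimate of the principal eigenvalue
    CI = (lam_max - n) / (n - 1)         # Eq. (3)
    CR = CI / RI[n]                      # Eq. (4)
    return w, CR

# Three criteria compared on Saaty's 1-9 scale (reciprocal matrix).
A = [[1,   3,   5],
     [1/3, 1,   3],
     [1/5, 1/3, 1]]
w, CR = ahp_weights(A)
print(w.round(3), round(CR, 3))  # CR <= 0.10 is conventionally acceptable
```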

TOPSIS

The final weight of each criterion obtained from the AHP method is used as input to the TOPSIS method, which consists of the following steps; see Hasnain et al. (2020), Chakraborty (2022), and Işıklar and Büyüközkan (2007) for details.

Step 1. A decision matrix \(B=[b_{ij}]_{m\times n}\) is constructed, where the m rows are the alternatives (i.e., OSS products) and the n columns are the evaluation criteria (i.e., measurable concepts).

Step 2. The normalized decision matrix \(R=[r_{ij}]_{m\times n}\) is constructed:

\(r_{ij}=b_{ij}\Big/\sqrt{\sum_{i=1}^{m}b_{ij}^{2}},\quad i=1,2,\dots,m;\; j=1,2,\dots,n\)  (5)

Step 3. The final weights obtained from the AHP method are multiplied by the values of the normalized decision matrix R, giving the weighted normalized decision matrix \(V=[v_{ij}]_{m\times n}\):

\(v_{ij}=w_{j}\,r_{ij},\quad i=1,2,\dots,m;\; j=1,2,\dots,n\)  (6)

Step 4. Two artificial alternatives, \(A^{+}\) (the positive ideal solution) and \(A^{-}\) (the negative ideal solution), are defined by Eqs. (7) and (8), respectively. Here, J is the subset of criteria \(\{j=1,2,\dots,n\}\) with positive impact in the OSS-QMM sense, and \(J^{-}\) is its complement:

\(A^{+}=\{(\max_{i}v_{ij}\mid j\in J),(\min_{i}v_{ij}\mid j\in J^{-})\mid i=1,2,\dots,m\}=\{v_{1}^{+},v_{2}^{+},\dots,v_{n}^{+}\}\)  (7)

\(A^{-}=\{(\min_{i}v_{ij}\mid j\in J),(\max_{i}v_{ij}\mid j\in J^{-})\mid i=1,2,\dots,m\}=\{v_{1}^{-},v_{2}^{-},\dots,v_{n}^{-}\}\)  (8)

Step 5. Separation measures are computed as the Euclidean distance between each alternative in V and the positive ideal vector \(A^{+}\) or the negative ideal vector \(A^{-}\), using Eqs. (9) and (10), respectively. At the end of this step, two values, \(S_{i}^{+}\) and \(S_{i}^{-}\), have been computed for each alternative, representing its distance from the positive and the negative ideal solutions:

\(S_{i}^{+}=\sqrt{\sum_{j=1}^{n}(v_{ij}-v_{j}^{+})^{2}},\quad i=1,2,\dots,m\)  (9)

\(S_{i}^{-}=\sqrt{\sum_{j=1}^{n}(v_{ij}-v_{j}^{-})^{2}},\quad i=1,2,\dots,m\)  (10)

Step 6. The relative closeness of \(A_{i}\) (the ith alternative) to the ideal solution \(A^{+}\) is defined by Eq. (11); \(C_{i}^{*}=1\) if and only if \(A_{i}=A^{+}\), and \(C_{i}^{*}=0\) if and only if \(A_{i}=A^{-}\):

\(C_{i}^{*}=S_{i}^{-}/(S_{i}^{-}+S_{i}^{+}),\quad 0<C_{i}^{*}<1,\; i=1,2,\dots,m\)  (11)

Step 7. The alternatives \(A_{i}\) are ranked in descending order of \(C_{i}^{*}\); a higher value corresponds to better performance. (No equation.) A runnable sketch of these steps follows.
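TOPSIS Steps 1–7 likewise fit in a few lines of NumPy. In the sketch below (the decision matrix, weights, and impact flags are invented for illustration), benefit[j] is True when criterion j has positive impact, i.e., j ∈ J in Eqs. (7) and (8).

```python
import numpy as np

def topsis(B, w, benefit):
    """Eqs. (5)-(11): closeness of each alternative to the ideal solution.

    B: m x n decision matrix (rows = OSS products, columns = criteria)
    w: criteria weights from AHP, summing to 1
    benefit: one flag per criterion; True = positive impact (higher is better)
    """
    B = np.asarray(B, dtype=float)
    benefit = np.asarray(benefit)
    R = B / np.sqrt((B ** 2).sum(axis=0))                    # Eq. (5)
    V = R * np.asarray(w)                                    # Eq. (6)
    A_pos = np.where(benefit, V.max(axis=0), V.min(axis=0))  # Eq. (7)
    A_neg = np.where(benefit, V.min(axis=0), V.max(axis=0))  # Eq. (8)
    S_pos = np.sqrt(((V - A_pos) ** 2).sum(axis=1))          # Eq. (9)
    S_neg = np.sqrt(((V - A_neg) ** 2).sum(axis=1))          # Eq. (10)
    return S_neg / (S_neg + S_pos)                           # Eq. (11)

# Three hypothetical ERP alternatives scored on three criteria.
C = topsis(B=[[7, 0.4, 120],
              [5, 0.9, 80],
              [6, 0.6, 95]],
           w=[0.5, 0.3, 0.2],
           benefit=[True, True, False])
print(C.round(3), C.argsort()[::-1])  # closeness values; product indices, best first
```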

Weighted distribution

The weight of each sub-characteristic can differ per OSS aspect (these weights are calculated in the AHP process). Therefore, the final weight of each sub-characteristic specific to an OSS aspect is calculated by Eq. (12). Here, \(X_{i}\) is the final weight of a sub-characteristic for an OSS aspect, \(w_{i}^{a}\) are the weights of the OSS aspects, \(w_{j}^{s}\) are the weights of the sub-characteristics, i indexes the OSS aspects (there are two), and m is the number of sub-characteristics:

\(X_{i}=(w_{i}^{a}\cdot w_{j}^{s})\Big/\sum_{i=1}^{n}w_{i}^{a},\quad \sum_{i=1}^{n}w_{i}^{a}=1\text{ (see Eq. (2))},\quad i\in\{1,2\},\; j=1,2,\dots,m\)  (12)

Mathematical equations

Some mathematical equations are used to obtain derived measures from base measures; these correspond to the concept of measurement function in the OSS-QMM. For example, if M1 and M2 are base measures, the derived measure M3 can be obtained as M3 = M1/(M1 + M2). The form of the equation differs from measure to measure.

Average of the measures

Where multiple measures are associated with one measurable concept, these measures must be aggregated. The normalized measures (obtained in Step 2 of TOPSIS) associated with the measurable concept are averaged, as in Eq. (13). Here, p is the number of alternatives (OSS products), \(m_{(k)}\) is the new value of the measures associated with a measurable concept for the kth alternative, \(r_{ij}\) is a normalized measure (Step 2 of TOPSIS), and m and n are the first and last indices of the measures associated with the measurable concept:

\(m_{(k)}=\sum_{j=1}^{n}r_{ij}\Big/n,\quad i\in\{m,\dots,n\},\; k=1,2,\dots,p\)  (13)

Linear utility function

Utility functions can be defined for each OSS product to operationalize the evaluation step: the higher the evaluation value of an OSS product, the better it is for software quality, and so the higher the associated utility should be. To reflect this, simple increasing linear utility functions are selected, with two thresholds, min and max, as shown in Fig. 3. A small sketch of Eq. (12) and this utility function is given below.
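The weighted distribution of Eq. (12) and the linear utility function are simple enough to state directly in code. In this sketch (threshold values are invented for illustration), the denominator of Eq. (12) drops out because the aspect weights already sum to 1 by Eq. (2).

```python
def aspect_weight(w_aspect, w_subchar):
    """Eq. (12): final weight of a sub-characteristic within an OSS aspect.
    Since the aspect weights sum to 1 (Eq. (2)), this reduces to a product."""
    return w_aspect * w_subchar

def linear_utility(x, lo, hi):
    """Increasing linear utility with two thresholds (cf. Fig. 3):
    0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

print(aspect_weight(0.6, 0.25))       # -> 0.15
print(linear_utility(0.4, 0.2, 0.8))  # -> 0.333...
```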

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yılmaz, N., Tarhan, A.K. Quality evaluation meta-model for open-source software: multi-method validation study. Software Qual J 32, 487–541 (2024). https://doi.org/10.1007/s11219-023-09658-w
