Quality evaluation meta-model for open-source software: multi-method validation study

  • Research
  • Published in: Software Quality Journal

Abstract

In recent years, open-source software (OSS) has attracted increasing attention due to its easy accessibility via cloud repositories, its voluntary community, freedom from vendor lock-in, and low total cost of ownership. In turn, specifying and evaluating OSS quality has become a significant challenge for organizations inclined to adopt OSS. Although many OSS quality models have been proposed in the literature, the dynamic and diverse nature of OSS has made these models heterogeneous in structure and content. This heterogeneity has hindered the standardization of evaluations and has made the results obtained from different OSS quality models for the same purpose incomparable and sometimes unreliable. Therefore, in this study, a meta-model for OSS quality (OSS-QMM) is proposed that unifies the structure of existing quality models and enables the derivation of homogeneous models. For this purpose, a systematic effort was spent via a step-based meta-model creation process including review-and-revise iterations. To validate the OSS-QMM, case study and expert opinion methods were applied to answer three research questions (RQs) targeting the practical applicability, results comparability, and effectiveness of using the meta-model. Multiple and embedded case study designs were employed to evaluate three real ERP systems, and 20 subject matter experts were interviewed during the validation process. The results of these multi-faceted empirical studies indicate that the OSS-QMM addresses the problems in OSS quality evaluation and adoption with a high degree of confidence.


Data availability

The data that support the findings of this study are openly available in Google Drive and Zenodo at the following URLs:

1. Definition of the terminologies in the SQMM, Zenodo. URL: https://doi.org/10.5281/zenodo.6367596

2. List of questions to obtain feedback from experts (Step 4.4), Google Drive. URL: https://tinyurl.com/2qwowtzh

3. Expert opinion (list of questions and expert answer sheet, Step 5), Google Drive. URL: https://tinyurl.com/2ow7ayua

4. Supplementary document for case studies 2 and 3, Zenodo. URL: https://doi.org/10.5281/zenodo.7986369

References

  • Adeoye-Olatunde, O. A., & Olenik, N. L. (2021). Research and scholarly methods: Semi-structured interviews. Journal of the American College of Clinical Pharmacy, 4(10), 1358–1367.

  • Adewumi, A., Misra, S., & Omoregbe, N. (2019). FOSSES: Framework for open-source software evaluation and selection. Software: Practice and Experience, 49(5), 780–812.

  • Adewumi, A., Misra, S., Omoregbe, N., Crawford, B., & Soto, R. (2016). A systematic literature review of open source software quality assessment models. SpringerPlus, 5(1), 1936.

  • Al-Dhaqm, A., Razak, S., Othman, S. H., Ngadi, A., Ahmed, M. N., & Ali Mohammed, A. (2017). Development and validation of a database forensic metamodel (DBFM). PloS One, 12(2), e0170793.

  • Alsolai, H., & Roper, M. (2020). A systematic literature review of machine learning techniques for software maintainability prediction. Information and Software Technology, 119, 106214.

  • Ardito, L., Coppola, R., Barbato, L., & Verga, D. (2020). A tool-based perspective on software code maintainability metrics: A systematic literature review. Scientific Programming, 2020.

  • Arthur, J. D., & Stevens, K. T. (1989). Assessing the adequacy of documentation through document quality indicators. In Proceedings. Conference on Software Maintenance, 40–49. IEEE.

  • Aversano, L., & Tortorella, M. (2013). Quality evaluation of floss projects: Application to ERP systems. Information and Software Technology, 55(7), 1260–1276.

  • Aversano, L., Guardabascio, D., & Tortorella, M. (2017). Analysis of the documentation of ERP software projects. Procedia Computer Science, 121, 423–430.

  • Bakar, A. D., Sultan, A. B. M., Zulzalil, H., & Din, J. (2012). Review on 'maintainability' metrics in open source software. International Review on Computers and Software, 7(3), 903–907.

  • Bayer, J., & Muthig, D. (2006). A view-based approach for improving software documentation practices. 13th Annual IEEE International Symposium and Workshop on Engineering of Computer-Based Systems (ECBS’06) (p. 10). IEEE.

  • Beydoun, G., Low, G., Henderson-Sellers, B., Mouratidis, H., Gomez-Sanz, J. J., Pavon, J., & Gonzalez-Perez, C. (2009). FAML: A generic metamodel for MAS development. IEEE Transactions on Software Engineering, 35(6), 841–863.

  • Boehm, B. W., Brown, H., & Lipow, M. (1978). Quantitative evaluation of software quality. In Proceedings of the 2nd International Conference on Software Engineering, 592–605.

  • Briand, L., Morasca, S., & Basili, V. (2002). An operational process for goal driven definition of measures. IEEE Transactions on Software Engineering, 28(12), 1106–1125.

  • Brings, J., Daun, M., Keller, K., Obe, P. A., & Weyer, T. (2020). A systematic map on verification and validation of emergent behavior in software engineering research. Future Generation Computer Systems, 112, 1010–1037.

  • Butler, S., Gamalielsson, J., Lundell, B., Brax, C., Mattsson, A., Gustavsson, T., & Lönroth, E. (2022). Considerations and challenges for the adoption of open source components in software-intensive businesses. Journal of Systems and Software, 186, 111152.

  • Chakraborty, S. (2022). TOPSIS and modified TOPSIS: A comparative analysis. Decision Analytics Journal, 2, 100021.

  • Chawla, M. K., & Chhabra, I. (2015, October). SQMMA: Software quality model for maintainability analysis. In Proceedings of the 8th Annual ACM India Conference, 9–17.

  • Chidamber, S. R., & Kemerer, C. F. (1994). A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493.

  • Codemetrics. (2019). URL: https://plugins.jetbrains.com/plugin/12159-codemetrics

  • Dagpinar, M., & Jahnke, J. H. (2003, November). Predicting maintainability with object-oriented metrics-an empirical comparison. In 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings, 155–155. IEEE Computer Society.

  • Dromey, R. G. (1995). A model for software product quality. IEEE Transactions on Software Engineering, 21(2), 146–162.

  • Dubey, S. K., & Rana, A. (2011). Assessment of maintainability metrics for object-oriented software system. ACM SIGSOFT Software Engineering Notes, 36(5), 1–7.

  • Duijnhouwer, F. W., & Widdows, C. (2003). Capgemini expert letter open source maturity model. Retrieved: 30 April 2022. Capgemini. URL: tinyurl.com/yxdbvjk6

  • Dweiri, F., Kumar, S., Khan, S. A., & Jain, V. (2016). Designing an integrated AHP based decision support system for supplier selection in automotive industry. Expert Systems with Applications, 62, 273–283.

  • Eghan, E. E., Alqahtani, S. S., Forbes, C., & Rilling, J. (2019). API trustworthiness: An ontological approach for software library adoption. Software Quality Journal, 27(3), 969–1014.

  • Frantz, R. Z., Rehbein, M. H., Berlezi, R., & Roos-Frantz, F. (2019). Ranking open source application integration frameworks based on maintainability metrics: A review of five-year evolution. Software: Practice and Experience, 49(10), 1531–1549.

  • Garcia, F., Bertoa, M. F., Calero, C., Vallecillo, A., Ruiz, F., Piattini, M., & Genero, M. (2006). Towards a consistent terminology for software measurement. Information and Software Technology, 48(8), 631–644.

  • Gezici, B., Özdemir, N., Yılmaz, N., Coşkun, E., Tarhan, A., & Chouseinoglou, O. (2019). Quality and success in open source software: A systematic mapping. 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 363–370. IEEE.

  • Goeb, A. (2013). A meta model for software architecture conformance and quality assessment. Electronic Communications of the EASST, 60.

  • Grady, R. B. (1992). Practical software metrics for project management and process improvement. Prentice Hall.

  • Hanefi Calp, M., & Arıcı, N. (2011). Nesne yönelimli tasarım metrikleri ve kalite özellikleriyle ilişkisi [Object-oriented design metrics and their relation to quality characteristics]. Politeknik Dergisi (Journal of Polytechnic), 14(1), 9–14.

  • Hanine, M., Boutkhoum, O., Tikniouine, A., & Agouti, T. (2016). Application of an integrated multi-criteria decision making AHP-TOPSIS methodology for ETL software selection. Springerplus, 5(1), 1–17.

  • Hasnain, S., Ali, M. K., Akhter, J., Ahmed, B., & Abbas, N. (2020). Selection of an industrial boiler for a soda-ash production plant using analytical hierarchy process and TOPSIS approaches. Case Studies in Thermal Engineering, 19, 100636.

  • Hauge, O., Osterlie, T., & Sorensen, C. F. (2009) An empirical study on selection of open source software-preliminary results. In: ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development. IEEE.

  • Hmood, A., Keivanloo, I., & Rilling, J. (2012, July). SE-EQUAM: An evolvable quality metamodel. In 2012 IEEE 36th Annual Computer Software and Applications Conference Workshops, 334–339. IEEE.

  • Ho, W., & Ma, X. (2018). The state-of-the-art integrations and applications of the analytic hierarchy process. European Journal of Operational Research, 267(2), 399–414.

  • IEEE standard glossary of software engineering terminology. (1990). IEEE Std 610.12-1990, 1–84.

  • IEEE standard for a software quality metrics methodology. (1998). IEEE Std 1061-1998.

  • Işıklar, G., & Büyüközkan, G. (2007). Using a multi-criteria decision making approach to evaluate mobile phone alternatives. Computer Standards & Interfaces, 29(2), 265–274.

  • ISO, International Standard ISO VIM. (1993). International vocabulary of basic and general terms in metrology (2nd ed.). International Organization for Standardization, Geneva, Switzerland.

  • ISO/IEC 14598-3:1999. (1999). Information technology-software product evaluation-Part 3: Process for developers. International Organization for Standardization, Geneva.

  • ISO/IEC 15939:2007. (2007). Information Technology—Software Engineering—Software Measurement Process. International Organization for Standardization, Geneva.

  • ISO/IEC 9126-1:2001. (2001). Software engineering - Product quality -Part 1: Quality model, international organization for standardization, Geneva, Switzerland.

  • Jha, S., Kumar, R., Abdel-Basset, M., Priyadarshini, I., Sharma, R., & Long, H. V. (2019). Deep learning approach for software maintainability metrics prediction. IEEE Access, 7, 61840–61855.

  • Jiang, S., Cao, J., & Qi, Q. (2021). Exploring development-related factors affecting the popularity of open source software projects. In 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 244–249. IEEE.

  • Joshi, A., Kale, S., Chandel, S., & Pal, D. K. (2015). Likert scale: Explored and explained. British Journal of Applied Science & Technology, 7(4), 396.

  • Khashei-Siuki, A., & Sharifan, H. (2020). Comparison of AHP and FAHP methods in determining suitable areas for drinking water harvesting in Birjand aquifer, Iran. Groundwater for Sustainable Development, 10, 100328.

  • Khatri, S. K., & Singh, I. (2016). Evaluation of open source software and improving its quality. 5th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO). IEEE.

  • Kim, H. M. (1999). Representing and reasoning about quality using enterprise models. PhD thesis, Dept. Mechanical and Industrial Engineering, University of Toronto, Canada.

  • Kitchenham, B., Hughes, R. T., & Linkman, S. G. (2001). Modeling software measurement data. IEEE Transactions on Software Engineering, 27(9), 788–804.

  • Kläs, M., Lampasona, C., Nunnenmacher, S., Wagner, S., Herrmannsdörfer, M., & Lochmann, K. (2010). How to evaluate meta-models for software quality. In Proceedings of the 20th International Workshop on Software Measurement.

  • Lenarduzzi, V., Taibi, D., Tosi, D., Lavazza, L., & Morasca, S. (2020). Open source software evaluation, selection, and adoption: A systematic literature review. In: 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE

  • Li, J., Conradi, R., Bunse, C., Torchiano, M., Slyngstad, O. P. N., & Morisio, M. (2009). Development with off-the-shelf components: 10 facts. IEEE Software, 26(2), 80–87.

  • List of questions and expert opinion (Step-5). (2023). Retrieved: 20 June 2023. URL: https://tinyurl.com/2ow7ayua

  • Magaldi, D., & Berler, M. (2020). Semi-structured interviews. Encyclopedia of Personality and Individual Differences, 4825–4830.

  • McCall, J. A., Richards, P. K., & Walters, G. F. (1977). Factors in software quality, volumes I, II, and III. US Rome Air Development Center Reports, US Department of Commerce, USA.

  • Mens, T., Doctors, L., Habra, N., Vanderose, B., & Kamseu, F. (2011). Qualgen: Modeling and analysing the quality of evolving software systems. In: 15th European Conference on Software Maintenance and Reengineering. IEEE.

  • MetricsReloaded. (2004). URL: https://plugins.jetbrains.com/plugin/93-metricsreloaded

  • Nistala, P., Nori, K. V., & Reddy, R. (2019). Software quality models: A systematic mapping study. International Conference on Software and System Processes (ICSSP), 125–134. IEEE.

  • Object Management Group (OMG). (2019). Meta Object Facility (MOF). Core specification version 2.5.1. Retrieved: 2 October 2022. URL: https://www.omg.org/spec/MOF/2.5.1/PDF

  • Othman, S. H., & Beydoun, G. (2010). Metamodelling approach to support disaster management knowledge sharing. In: 21st Australasian Conference on Information Systems.

  • Othman, S. H., Beydoun, G., & Sugumaran, V. (2014). Development and validation of a disaster management metamodel (DMM). Information Processing & Management, 50(2), 235–271.

  • Saaty, T. L. (1980). The analytic hierarchy process: Planning, priority setting and resource allocation. McGraw-Hill.

  • Saaty, T. L. (2008). Decision making with the analytic hierarchy process. International Journal of Services Sciences, 1(1), 83–98.

  • Saaty, T. L., & Sagir, M. (2015). Ranking countries more reliably in the summer olympics. International Journal of the Analytic Hierarchy Process, 7(3), 589–610.

  • Salem, I. E. B. (2015). Transformational leadership: Relationship to job stress and job burnout in five-star hotels. Tourism and Hospitality Research, 15(4), 240–253.

  • Samoladas, I., Gousios, G., & Spinellis, D. (2008). The SQO-OSS quality model: Measurement-based open source software evaluation. In: IFIP International Conference on Open Source Systems. Springer, Boston, MA.

  • SciTools Understand. (2020). URL: https://scitools.com/

  • Semeteys, R. (2006). Method for qualification and selection of open source software (QSOS), version 1.6. Retrieved: 30 April 2022. URL: tinyurl.com/y2phllex

  • Silva, D. G., Coutinho, C., & Costa, C. J. (2023). Factors influencing free and open-source software adoption in developing countries—an empirical study. Journal of Open Innovation: Technology, Market, and Complexity, 9(1), 21–33.

  • Sjoberg, G., Orum, A. M., & Feagin, J. R. (2020). A case for the case study. The University of North Carolina Press.

  • Soto, M., & Ciolkowski, M. (2009). The QualOSS open source assessment model measuring the performance of open source communities. In: 3rd International Symposium on Empirical Software Engineering and Measurement. IEEE.

  • Spinellis, D., & Jureczko, M. (2011, May). Metric description [Online]. Available: http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/

  • Swedberg, R. (2020). Exploratory research. The Production of Knowledge: Enhancing Progress in Social Science, 17–41.

  • Tanrıöver, Ö. Ö., & Bilgen, S. (2011). A framework for reviewing domain specific conceptual models. Computer Standards & Interfaces, 33(5), 448–464.

  • Tassone, J., Xu, S., Wang, C., Chen, J., & Du, W. (2018) Quality assessment of open source software: A review. IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), 411–416. IEEE.

  • Vanderose, B., Habra, N., & Kamseu, F. (2010). Towards a model-centric quality assessment. In Proceedings of the 20th International Workshop on Software Measurement (IWSM 2010): Conference on Software Process and Product Measurement (Stuttgart Nov 2010).

  • Visconti, M., & Cook, C. R. (2002) An overview of industrial software documentation practice. In 12th International Conference of the Chilean Computer Science Society, 2002. Proceedings, 179–186. IEEE.

  • Wagner, S., Goeb, A., Heinemann, L., Kläs, M., Lampasona, C., Lochmann, K., Mayr, A., Plösch, R., Seidl, A., Streit, J., & Trendowicz, A. (2015). Operationalised product quality models and assessment: The Quamoco approach. Information and Software Technology, 62, 101–123.

  • Wasserman, A. I., Pal, M., & Chan, C. (2006). Business readiness rating project, BRR whitepaper RFC 1. URL: tinyurl.com/y5srd5sq

  • Wasserman, A. I., Guo, X., McMillian, B., Qian, K., Wei, M. Y., & Xu, Q. (2017). OSSpal: Finding and evaluating open source software. In Open Source Systems: Towards Robust Practices: 13th IFIP WG 2.13 International Conference.

  • Wohlin, C. (2021). Case study research in software engineering—it is a case, and it is a study, but is it a case study? Information and Software Technology, 133, 106514.

  • Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., & Wesslén, A. (2012). Experimentation in software engineering: An introduction. Springer.

  • Yalcin, A. S., Kilic, H. S., & Delen, D. (2022). The use of multi-criteria decision-making methods in business analytics: A comprehensive literature review. Technological Forecasting and Social Change, 174, 121193.

  • Yin, R. K. (2018). Case study research and applications: Design and methods (6th ed.). Sage.

  • Yılmaz, N., & Tarhan, A. K. (2020). Meta-models for software quality and its evaluation: A systematic literature review. In: International Workshop on Software Measurement and the 15th International Conference on Software Process and Product Measurement, Mexico.

  • Yılmaz, N., & Tarhan, A. K. (2022a). Quality evaluation models or frameworks for open source software: A systematic literature review. Journal of Software: Evolution and Process, 34(6), e2458. https://doi.org/10.1002/smr.2458

  • Yılmaz, N., & Tarhan, A. K. (2022b). Matching terms of quality models and meta-models: Toward a unified meta-model of OSS quality. Software Quality Journal, 1–53. https://doi.org/10.1007/s11219-022-09603-3

  • Yilmaz, N., & Tarhan, A. K. (2022c). Definition of the term used in the SQMM. Zenodo. https://doi.org/10.5281/zenodo.6367596

  • Yilmaz, N., & Tarhan, A. K. (2023). Supplementary document of the article titled 'Quality evaluation meta-model for open-source software'. Zenodo. https://doi.org/10.5281/zenodo.7986369

  • Zhao, Y., Liang, R., Chen, X., & Zou, J. (2021). Evaluation indicators for open-source software: A review. Cybersecurity, 4(1), 1–24.

Download references

Funding

No funding was obtained for this study.

Author information

Authors and Affiliations

Authors

Contributions

NY and AKT conceived the presented idea and designed the empirical studies of validation. NY carried out the empirical studies, discussed the results with AKT, and wrote the manuscript. AKT reviewed the manuscript in several iterations and suggested revisions as necessary.

Corresponding authors

Correspondence to Nebi Yılmaz or Ayça Kolukısa Tarhan.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Final version of OSS-QMM (please refer to Yılmaz & Tarhan, 2022b for detailed explanations of the concepts and relationships)

[Figure: final version of the OSS-QMM]

Appendix 2. New operationalized quality model derived from OSS-QMM

[Figure: new operationalized quality model derived from the OSS-QMM]

Appendix 3. Mapping of the terms in the existing OSS quality models (OSMM, OpenBRR, OSSpal, and SQO-OSS) to the concepts of the OSS-QMM

| OSS-QMM concept | OSMM | OpenBRR | OSSpal | SQO-OSS (code-based) | SQO-OSS (community-based) |
|---|---|---|---|---|---|
| Viewpoint | Developer | Developer | Developer | Developer | Developer |
| OSS aspect | Community-based | Community-based | Community-based | Code-based | Community-based |
| Information need | Calculation of developer size to evaluate maintainability | Calculation of developer productivity to evaluate maintainability | Calculation of consulting services quality to evaluate maintainability | Calculation of comment frequency to evaluate maintainability | Calculation of documentation quality to evaluate maintainability |
| Characteristic | Maintainability | Maintainability | Maintainability | Maintainability | Maintainability |
| Sub-characteristic | Acceptance | Product quality | Support and service | Analyzability | Analyzability |
| Entity | Developer | Contributor | Contributor | Source code | Contributor |
| Quality requirement | A large developer community is desirable for maintainability | Productive developers are desirable for maintainability | An active consulting service is desirable for maintainability | A high comment frequency is desirable for maintainability | A large number of documents is desirable for maintainability |
| Impact | Positive | Positive | Positive | Positive | Positive |
| Measurable concept | The size of the developer community | Productivity of contributors | The activeness of the consulting community | Complexity of source code | Completeness of documentation |
| Measure | Number of developers (base measure) | Number of releases (base measure) | Number of consulting communities (base measure) | Weighted methods per class (WMC) (base measure) | Number of documents (base measure) |
| Unit | Developer | Release | Consulting community | Methods | Documents |
| Scale | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSMM) | Integer from zero to three (the score (1–3) is assigned w.r.t. rules given in OpenBRR) | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSSpal) | Integer from zero to infinity | Integer from zero to infinity |
| Measurement method | Manual | Manual | Manual | Automated (e.g., SciTools Understand, CKJM, and IntelliJ IDEA) | Manual |
| Measurement function | None (base measure) | None (base measure) | None (base measure) | None (base measure) | None (base measure) |
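To make the instantiation concrete, the sketch below models a slice of these OSS-QMM concepts as Python dataclasses and encodes the OSMM column of the table above. This is a minimal illustration only; the class and field names are ours, not artifacts of the published meta-model.

```python
from dataclasses import dataclass

@dataclass
class Measure:
    name: str
    unit: str
    scale: str
    method: str           # "manual" or "automatic"
    is_base: bool = True  # base measures need no measurement function

@dataclass
class QualityMapping:
    """One column of the Appendix 3 mapping, tied to OSS-QMM concepts."""
    viewpoint: str
    oss_aspect: str
    information_need: str
    characteristic: str
    sub_characteristic: str
    entity: str
    impact: str
    measurable_concept: str
    measure: Measure

# The OSMM column of the table, expressed as one instance.
osmm_row = QualityMapping(
    viewpoint="Developer",
    oss_aspect="Community-based",
    information_need="Calculation of developer size to evaluate maintainability",
    characteristic="Maintainability",
    sub_characteristic="Acceptance",
    entity="Developer",
    impact="Positive",
    measurable_concept="The size of the developer community",
    measure=Measure(name="Number of developers", unit="Developer",
                    scale="integer 0-5 (OSMM scoring rules)", method="manual"),
)

print(osmm_row.measure.name)  # -> Number of developers
```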

Appendix 4. Design details about case studies: list of selected OSS products, code-based measures, community-based measures, and steps of integrated AHP-TOPSIS method

(a) List of selected OSS products used in the case studies

| Product property | Apache OFBiz | Adempiere | Compiere |
|---|---|---|---|
| Website | https://ofbiz.apache.org/ | https://adempiere.org/ | http://www.compiere.com/ |
| Product type | Open-source ERP system | Open-source ERP system | Open-source ERP system |
| Programming language | Java (1,547,623 LOC) | Java (1,973,229 LOC) | Java (1,402,191 LOC) |
| First release date | 2009 | 2006 | 2000 |

(b) List of code-based measures with their description and measurable concepts associated with each measure (Bakar et al., 2012; Chawla & Chhabra, 2015; Dagpinar & Jahnke, 2003; Dubey & Rana, 2011)

| Measurable concept (MC) | Measure | Description |
|---|---|---|
| MC1: complexity of source code | WMC: weighted methods per class | The degree of complexity and the number of methods in a class (Hanefi Calp et al., 2011). As the number of methods increases, the time needed to analyze the code increases accordingly (Chidamber & Kemerer, 1994) |
|  | CC: cyclomatic complexity | Measures the number of linearly independent paths through the program source code and is directly related to the complexity of the code. A high value of this metric is undesirable and degrades source code analyzability |
|  | NNL: number of nested levels | Measures the depth of nesting of the loops in a class; a higher value of this metric reduces testability and stability |
| MC2: comment frequency of source code | NOS: number of statements | Used to measure the frequency of comments and explanations, which reduce the complexity of the software and facilitate tracking and understanding the program |
| MC3: inheritance complexity degree of source code | DIT: depth of inheritance tree | Measures the distance of a class from the root of the inheritance tree (Chidamber & Kemerer, 1994). A deep tree increases complexity, since it involves more classes and methods, indicating low changeability and stability of the software product |
|  | NOC: number of children | Measures the number of subclasses derived from a class. A high value indicates greater reuse, but also that more errors may occur and higher testing effort is required (Chidamber & Kemerer, 1994) |
| MC4: interaction complexity (coupling) degree of source code | CBO: coupling between object classes | Represents the number of classes coupled to a given class, i.e., classes whose properties or methods are used without inheritance between the classes (Chidamber & Kemerer, 1994). High dependence between classes harms the modular design and reduces changeability |
|  | RFC: response for a class | Measures the number of methods that can be triggered when the methods of an object of the class are called, i.e., the methods written in the class plus the methods they call (Chidamber & Kemerer, 1994). Software products with a lower RFC value can be better understood and tested |
| MC5: cohesion degree of source code | LCOM: lack of cohesion of methods | Measures the degree of dissimilarity among the methods of a class (Chidamber & Kemerer, 1994); low values of this metric are therefore desirable |
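In the case studies, these measures were collected from Java code with tools such as SciTools Understand, CKJM, and IntelliJ IDEA plugins. As a rough, self-contained illustration of what such tooling computes, the sketch below uses Python's ast module to derive a crude WMC (every method weighted as 1, so WMC equals the method count) and an NNL-like nesting depth; this is our own simplified analogue, not the tooling used in the study.

```python
import ast

SRC = """
class Order:
    def total(self, items):
        s = 0
        for it in items:          # nesting level 1
            if it.qty > 0:        # nesting level 2
                s += it.qty * it.price
        return s

    def is_empty(self, items):
        return len(items) == 0
"""

def nesting_depth(node, depth=0):
    """NNL-like measure: deepest chain of nested control-flow statements."""
    nested = (ast.For, ast.While, ast.If, ast.With, ast.Try)
    best = depth
    for child in ast.iter_child_nodes(node):
        d = depth + (1 if isinstance(child, nested) else 0)
        best = max(best, nesting_depth(child, d))
    return best

tree = ast.parse(SRC)
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        # Crude WMC: weight every method as 1, so WMC == number of methods.
        methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
        print(node.name, "WMC =", len(methods), "NNL =", nesting_depth(node))
```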

(c) List of the community-based measures with their equation and measurable concepts associated with each measure

| Measurable concept (MC) | Measure | Measurement function (equation) |
|---|---|---|
| MC6: difficulty degree of bugs | *BSI: bug severity index | \(\mathrm{BSI}=\frac{\#\,\text{blocker}}{\text{LOC}}\times 9+\frac{\#\,\text{critical}}{\text{LOC}}\times 7+\frac{\#\,\text{major}}{\text{LOC}}\times 5+\frac{\#\,\text{minor}}{\text{LOC}}\times 3+\frac{\#\,\text{trivial}}{\text{LOC}}\times 1\) |
| MC7: completeness of documentation | ND: number of documents | No equation (base measure) |
| MC8: the activeness of the community | *CD: commit density | \(\mathrm{CD}=\frac{\#\,\text{commits}/\#\,\text{developers}}{\text{kLOC}}\) |
|  | *ED: email density | \(\mathrm{ED}=\frac{\#\,\text{emails}/\#\,\text{developers}}{\text{LOC}/\#\,\text{releases}}\) |
| MC9: size of the development community | NC: number of contributors | No equation (base measure) |
| MC10: performance of contributors | *FRIS: feature request implementation success | \(\mathrm{FRIS}=\frac{\#\,\text{closed feature requests}/\#\,\text{total feature requests}}{\text{kLOC}}\) |
|  | *BSSR: bug-solving success rate | \(\mathrm{BSSR}=\frac{\#\,\text{closed bugs}/\#\,\text{total bugs}}{\text{kLOC}}\) |
| MC11: productivity of contributors | NR: number of releases | No equation (base measure) |
| MC12: fault proneness of contributors | *DD: defect density | \(\mathrm{DD}=\frac{\#\,\text{total defects}}{\text{LOC}}\) |
| MC13: maturity of the project | PA: product age | No equation (base measure) |
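The measurement functions above translate directly into code. The Python helpers below (our own naming; in the case studies the argument values came from the products' repositories and issue trackers) implement the starred derived measures exactly as given in the equations.

```python
def bsi(blocker, critical, major, minor, trivial, loc):
    """Bug severity index: severity-weighted bug counts per LOC."""
    weights = (9, 7, 5, 3, 1)
    counts = (blocker, critical, major, minor, trivial)
    return sum(c / loc * w for c, w in zip(counts, weights))

def commit_density(commits, developers, kloc):
    """CD: commits per developer, normalized by product size in kLOC."""
    return (commits / developers) / kloc

def email_density(emails, developers, loc, releases):
    """ED: emails per developer over average LOC per release."""
    return (emails / developers) / (loc / releases)

def fris(closed_frs, total_frs, kloc):
    """Feature request implementation success, normalized by kLOC."""
    return (closed_frs / total_frs) / kloc

def bssr(closed_bugs, total_bugs, kloc):
    """Bug-solving success rate, normalized by kLOC."""
    return (closed_bugs / total_bugs) / kloc

def defect_density(total_defects, loc):
    """DD: defects per line of code."""
    return total_defects / loc

# Toy example: a 1.5 MLOC product with a handful of open issues.
print(round(bsi(2, 5, 20, 40, 10, loc=1_500_000) * 1e6, 1))  # per-MLOC view
```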

(d) Integrated AHP-TOPSIS method used for quality evaluation in the case studies

[Figure e: steps of the integrated AHP-TOPSIS method used in the case studies]

Appendix 5. List of community-based measures and their descriptions


Bug severity index (BSI)

Bug severity is a classification of software defects (bugs) indicating the degree of negative impact on the quality of the software. It generally comprises five levels, from most severe to least severe: blocker, critical, major, minor, and trivial. The bug reporting databases of the OSS products are investigated to determine the severity of bugs. Since each level of bug affects the product differently, bugs are weighted according to their severity as follows: blocker = 9, critical = 7, major = 5, minor = 3, and trivial = 1. Accordingly, the number of bugs at each severity level, lines of code (LOC), and the severity weights are used as base measures to calculate BSI via the equation in Appendix 4(c): the number of bugs at each severity level is divided by the size of the product, each result is multiplied by the weight of its level, and the weighted results are summed.

Number of documents (ND)

Software documentation plays a very important role in all phases of a software system's life cycle. In the context of ERP, documentation is even more relevant due to the complexity of such software systems and the strategic role they play within operating organizations. It matters not only from the viewpoint of software engineering but also from the viewpoint of the user, who needs to know how to install and use an OSS system, import and manage data, and so on. In this context, various documents (e.g., a user guide, a technical guide, a database installation guide, a developer guide, API documentation, and Wiki pages) can be available for OSS systems in different formats (e.g., PDF and HTML). To access these documents, the websites of the three open-source ERP systems and their cloud repositories (e.g., SourceForge and GitHub) were visited, and the number of accessible documents was collected. Since ND is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Commit density (CD)

A commit is an operation that sends the latest changes to the source code (e.g., added or removed lines) to the repository. Every change to the source code of an OSS system has a purpose, e.g., to adapt, correct, perfect, or extend the system. Once performed, commits are stored in a version control system, such as the concurrent versions system (CVS) or Git. A high number of commits may therefore indicate active developers and a strong potential of the product to evolve through changes. The number of commits, the number of developers, and kilo lines of code (kLOC) are used as base measures to calculate commit density via the equation in Appendix 4(c): since a high number of developers and a large product size (i.e., kLOC) are likely to inflate the raw commit count, the count is normalized by both.

Email density (ED)

The mailing lists of an OSS product are archived monthly (e.g., in the CVS archive), and anyone with an interest in development can join them. In the case study, we considered the mailing list among developers. It contains different sorts of messages, including technical discussions, proposed changes, automatic notification messages about changes in the code, and problem reports. The number of emails, the number of developers, lines of code (LOC), and the number of versions (releases) are used as base measures to calculate email density via the equation in Appendix 4(c). As each version introduces new features, proposed changes and technical discussions among developers increase, which triggers an increase in email traffic. The number of developers matters because only emails among developers are counted. The size of the product (i.e., LOC) is also highly likely to be directly proportional to the number of emails: as the size of the product increases, so do product-related problems and, with them, the number of emails.

Number of contributors (NC)

The saying "many hands make light work" certainly holds true for an open-source product. If a contributor's ambitions and interests change, that person often moves on to other things. Also, an increase in the number of contributors to an OSS product makes the community more heterogeneous, and this heterogeneity contributes to the quality of OSS products. For example, if the contributors are employees of a single small company, there is the risk of that company cutting its support. As a result, the larger the group of contributors, the smaller the chance that development of the OSS product stalls. Since NC is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Feature request implementation success (FRIS)

Feedback from OSS users or developers constitutes a vital part of the evolution of OSS projects. Issue tracking systems (ITS) such as Bugzilla serve to request new features or enhancements to the OSS: users or developers express their demands for further development of the OSS product as feature requests in the ITS. Developers are then expected to implement these incoming feature requests to evolve the OSS product, so success in implementing them is important for the quality of OSS. This measure directly reflects the performance of the developers with respect to implementing feature requests. The number of closed (implemented) feature requests, the number of total feature requests, and kilo lines of code (kLOC) are used as base measures to calculate FRIS via the equation in Appendix 4(c).

Bug-solving success rate (BSSR)

A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result or to behave in unintended ways. Issue tracking systems (e.g., Bugzilla, Trac, or OTRS (open-source ticket requesting system)) are used to track the bug reports of OSS products. Teams that are more capable or disciplined in handling incoming bugs are generally considered more successful; that is, this measure directly reflects the performance of the developers with respect to bug solving. The number of closed (solved) bugs, the number of total bugs, and kilo lines of code (kLOC) are used as base measures to calculate BSSR via the equation in Appendix 4(c).

Number of releases (NR)

A software release is a change or set of changes packaged for delivery to the end-user, adding new features or enhancements. As expectations of software constantly change, many versions of the software may be released over its lifetime. The addition of new features with each version and the enhancement of the software can be related to the productivity of the developers. Although releases can be classified as minor, major, or emergency releases, the total number of releases is considered in this case study. Since NR is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Defect density (DD)

Defect density is the number of defects detected in a software component during a defined period of development or operation, divided by the size of the software component. Minimizing defect density is important for establishing the balance of time, cost, and quality that is critical to OSS projects. The number of total defects and lines of code (LOC) are used as base measures to calculate DD via the equation in Appendix 4(c).

Product age (PA)

The age of a product is the time from the product's creation to the present. The longer a product remains under active development, the smaller the chance that its development will suddenly stop. In this sense, the first year is the biggest challenge and hurdle for open-source initiatives: an initiative is often halted because a small community cannot sustain the workload that the product generates. Since developers typically receive no financial compensation, the product must attract new developers in order to live on for many years. In this context, the longevity of the product is important for product quality. Since PA is already a base measure, there is no need for an equation, as indicated in Appendix 4(c).

Appendix 6. Description of evaluation methods with their formulae used in the case studies

AHP

The AHP method consists of the following steps; see Ho and Ma (2018) and Saaty (1980, 2008) for details.

Step 1. Structural hierarchies are created; the OSS aspect and quality characteristic concepts provide this structure. (No equation.)

Step 2. A pairwise comparison matrix \(A=[x_{ij}]_{n\times n}\) is constructed to compare the criteria in pairs. Each OSS aspect and its related sub-characteristics serve as the "criteria."

Step 3. Pairwise comparisons are performed by judging the relative importance of two selected criteria at a time. Matrix A is filled in using Saaty's 1–9 scale (Saaty, 1980; see Ho & Ma, 2018 for details).

Step 4. Matrix A is normalized to obtain the normalized pairwise comparison matrix \(A_{\text{norm}}\); each element of A is divided by the sum of the elements in its column:

\(A_{\text{norm}}=[a_{ij}]_{n\times n},\quad a_{ij}=\frac{x_{ij}}{\sum_{i=1}^{n}x_{ij}}\)  (1)

Step 5. The final weight of each criterion is calculated:

\(w_{i}=\frac{\sum_{j=1}^{n}a_{ij}}{n},\quad \sum_{i=1}^{n}w_{i}=1,\quad i,j=1,2,\dots,n\)  (2)

Step 6. The consistency ratio (CR) is calculated to check the consistency of the decision-maker's judgments. First the consistency index (CI) is computed, where \(\lambda_{\max}\) is the principal eigenvalue corresponding to the matrix of pairwise comparisons and n is the number of criteria being compared; then CR is obtained, where the random index (RI) is a value that depends on n (see Saaty, 2008; Ho & Ma, 2018 for the RI values):

\(CI=(\lambda_{\max}-n)/(n-1)\)  (3)

\(CR=CI/RI\)  (4)

Step 7. The final weight of each criterion is approved. (No equation.) A runnable sketch of these steps is given below.
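A compact sketch of AHP Steps 2–6 using NumPy follows; the 3×3 pairwise comparison matrix and its judgments are invented purely for illustration.

```python
import numpy as np

# Random index (RI) values for n = 3..10 (Saaty, 1980).
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(A):
    """Eqs. (1)-(4): column-normalize, average rows, then check consistency."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    A_norm = A / A.sum(axis=0)           # Eq. (1)
    w = A_norm.mean(axis=1)              # Eq. (2)
    lam_max = float((A @ w / w).mean())  # estimate of the principal eigenvalue
    CI = (lam_max - n) / (n - 1)         # Eq. (3)
    CR = CI / RI[n]                      # Eq. (4)
    return w, CR

# Three criteria compared on Saaty's 1-9 scale (reciprocal matrix).
A = [[1,   3,   5],
     [1/3, 1,   3],
     [1/5, 1/3, 1]]
w, CR = ahp_weights(A)
print(w.round(3), round(CR, 3))  # CR <= 0.10 is conventionally acceptable
```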

TOPSIS

The final weight of each criterion obtained from the AHP method is used as input to the TOPSIS method, which consists of the following steps; see Hasnain et al. (2020), Chakraborty (2022), and Işıklar and Büyüközkan (2007) for details.

Step 1. A decision matrix \(B=[b_{ij}]_{m\times n}\) is constructed, where the m rows are the alternatives (i.e., OSS products) and the n columns are the evaluation criteria (i.e., measurable concepts).

Step 2. The normalized decision matrix \(R=[r_{ij}]_{m\times n}\) is constructed:

\(r_{ij}=b_{ij}\Big/\sqrt{\sum_{i=1}^{m}b_{ij}^{2}},\quad i=1,2,\dots,m;\; j=1,2,\dots,n\)  (5)

Step 3. The final weights obtained from the AHP method are multiplied by the values of the normalized decision matrix R, giving the weighted normalized decision matrix \(V=[v_{ij}]_{m\times n}\):

\(v_{ij}=w_{j}\,r_{ij},\quad i=1,2,\dots,m;\; j=1,2,\dots,n\)  (6)

Step 4. Two artificial alternatives, \(A^{+}\) (the positive ideal solution) and \(A^{-}\) (the negative ideal solution), are defined by Eqs. (7) and (8), respectively. Here, J is the subset of criteria \(\{j=1,2,\dots,n\}\) with positive impact in the OSS-QMM sense, and \(J^{-}\) is its complement:

\(A^{+}=\{(\max_{i}v_{ij}\mid j\in J),(\min_{i}v_{ij}\mid j\in J^{-})\mid i=1,2,\dots,m\}=\{v_{1}^{+},v_{2}^{+},\dots,v_{n}^{+}\}\)  (7)

\(A^{-}=\{(\min_{i}v_{ij}\mid j\in J),(\max_{i}v_{ij}\mid j\in J^{-})\mid i=1,2,\dots,m\}=\{v_{1}^{-},v_{2}^{-},\dots,v_{n}^{-}\}\)  (8)

Step 5. Separation measures are computed as the Euclidean distance between each alternative in V and the positive ideal vector \(A^{+}\) or the negative ideal vector \(A^{-}\), using Eqs. (9) and (10), respectively. At the end of this step, two values, \(S_{i}^{+}\) and \(S_{i}^{-}\), have been computed for each alternative, representing its distance from the positive and the negative ideal solutions:

\(S_{i}^{+}=\sqrt{\sum_{j=1}^{n}(v_{ij}-v_{j}^{+})^{2}},\quad i=1,2,\dots,m\)  (9)

\(S_{i}^{-}=\sqrt{\sum_{j=1}^{n}(v_{ij}-v_{j}^{-})^{2}},\quad i=1,2,\dots,m\)  (10)

Step 6. The relative closeness of \(A_{i}\) (the ith alternative) to the ideal solution \(A^{+}\) is defined by Eq. (11); \(C_{i}^{*}=1\) if and only if \(A_{i}=A^{+}\), and \(C_{i}^{*}=0\) if and only if \(A_{i}=A^{-}\):

\(C_{i}^{*}=S_{i}^{-}/(S_{i}^{-}+S_{i}^{+}),\quad 0<C_{i}^{*}<1,\; i=1,2,\dots,m\)  (11)

Step 7. The alternatives \(A_{i}\) are ranked in descending order of \(C_{i}^{*}\); a higher value corresponds to better performance. (No equation.) A runnable sketch of these steps follows.
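TOPSIS Steps 1–7 likewise fit in a few lines of NumPy. In the sketch below (the decision matrix, weights, and impact flags are invented for illustration), benefit[j] is True when criterion j has positive impact, i.e., j ∈ J in Eqs. (7) and (8).

```python
import numpy as np

def topsis(B, w, benefit):
    """Eqs. (5)-(11): closeness of each alternative to the ideal solution.

    B: m x n decision matrix (rows = OSS products, columns = criteria)
    w: criteria weights from AHP, summing to 1
    benefit: one flag per criterion; True = positive impact (higher is better)
    """
    B = np.asarray(B, dtype=float)
    benefit = np.asarray(benefit)
    R = B / np.sqrt((B ** 2).sum(axis=0))                    # Eq. (5)
    V = R * np.asarray(w)                                    # Eq. (6)
    A_pos = np.where(benefit, V.max(axis=0), V.min(axis=0))  # Eq. (7)
    A_neg = np.where(benefit, V.min(axis=0), V.max(axis=0))  # Eq. (8)
    S_pos = np.sqrt(((V - A_pos) ** 2).sum(axis=1))          # Eq. (9)
    S_neg = np.sqrt(((V - A_neg) ** 2).sum(axis=1))          # Eq. (10)
    return S_neg / (S_neg + S_pos)                           # Eq. (11)

# Three hypothetical ERP alternatives scored on three criteria.
C = topsis(B=[[7, 0.4, 120],
              [5, 0.9, 80],
              [6, 0.6, 95]],
           w=[0.5, 0.3, 0.2],
           benefit=[True, True, False])
print(C.round(3), C.argsort()[::-1])  # closeness values; product indices, best first
```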

Weighted distribution

The weight of each sub-characteristic can differ per OSS aspect (these weights are calculated in the AHP process). Therefore, the final weight of each sub-characteristic specific to an OSS aspect is calculated by Eq. (12). Here, \(X_{i}\) is the final weight of a sub-characteristic for an OSS aspect, \(w_{i}^{a}\) are the weights of the OSS aspects, \(w_{j}^{s}\) are the weights of the sub-characteristics, i indexes the OSS aspects (there are two), and m is the number of sub-characteristics:

\(X_{i}=(w_{i}^{a}\cdot w_{j}^{s})\Big/\sum_{i=1}^{n}w_{i}^{a},\quad \sum_{i=1}^{n}w_{i}^{a}=1\text{ (see Eq. (2))},\quad i\in\{1,2\},\; j=1,2,\dots,m\)  (12)

Mathematical equations

Some mathematical equations are used to obtain derived measures from base measures; these correspond to the concept of measurement function in the OSS-QMM. For example, if M1 and M2 are base measures, the derived measure M3 can be obtained as M3 = M1/(M1 + M2). The form of the equation differs from measure to measure.

Average of the measures

Where multiple measures are associated with one measurable concept, these measures must be aggregated. The normalized measures (obtained in Step 2 of TOPSIS) associated with the measurable concept are averaged, as in Eq. (13). Here, p is the number of alternatives (OSS products), \(m_{(k)}\) is the new value of the measures associated with a measurable concept for the kth alternative, \(r_{ij}\) is a normalized measure (Step 2 of TOPSIS), and m and n are the first and last indices of the measures associated with the measurable concept:

\(m_{(k)}=\sum_{j=1}^{n}r_{ij}\Big/n,\quad i\in\{m,\dots,n\},\; k=1,2,\dots,p\)  (13)

Linear utility function

Utility functions can be defined for each OSS product to operationalize the evaluation step: the higher the evaluation value of an OSS product, the better it is for software quality, and so the higher the associated utility should be. To reflect this, simple increasing linear utility functions are selected, with two thresholds, min and max, as shown in Fig. 3. A small sketch of Eq. (12) and this utility function is given below.
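The weighted distribution of Eq. (12) and the linear utility function are simple enough to state directly in code. In this sketch (threshold values are invented for illustration), the denominator of Eq. (12) drops out because the aspect weights already sum to 1 by Eq. (2).

```python
def aspect_weight(w_aspect, w_subchar):
    """Eq. (12): final weight of a sub-characteristic within an OSS aspect.
    Since the aspect weights sum to 1 (Eq. (2)), this reduces to a product."""
    return w_aspect * w_subchar

def linear_utility(x, lo, hi):
    """Increasing linear utility with two thresholds (cf. Fig. 3):
    0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

print(aspect_weight(0.6, 0.25))       # -> 0.15
print(linear_utility(0.4, 0.2, 0.8))  # -> 0.333...
```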

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yılmaz, N., Tarhan, A.K. Quality evaluation meta-model for open-source software: multi-method validation study. Software Qual J 32, 487–541 (2024). https://doi.org/10.1007/s11219-023-09658-w
