Abstract
In recent years, open-source software (OSS) has attracted increasing attention owing to its easy accessibility via cloud repositories, its volunteer communities, the absence of vendor lock-in, and its low total cost of ownership. In turn, specifying and evaluating OSS quality has become a significant challenge for organizations inclined to adopt OSS. Although many OSS quality models have been proposed in the literature, the dynamic and diverse nature of OSS has made these models heterogeneous in structure and content. This heterogeneity has hindered the standardization of evaluations, so that results obtained from different OSS quality models for the same purpose are incomparable and sometimes unreliable. Therefore, in this study, a meta-model for OSS quality (OSS-QMM) is proposed, which unifies the structure of existing quality models and enables the derivation of homogeneous models. For this purpose, a systematic and laborious effort was carried out via a step-based meta-model creation process including review-and-revise iterations. To validate the OSS-QMM, case study and expert opinion methods were applied to answer three research questions (RQs) targeting the practical applicability, comparability of results, and effectiveness of using the meta-model. Multiple and embedded case study designs were employed to evaluate three real ERP systems, and 20 subject matter experts were interviewed during the validation process. The results of these multi-faceted empirical studies indicate, with a high degree of confidence, that the OSS-QMM addresses the problems of OSS quality evaluation and adoption.
Data availability
The data that support the findings of this study are openly available in Google Drive and Zenodo at the following URLs:
1. Definition of the terminologies in the SQMM, Zenodo, URL: https://doi.org/10.5281/zenodo.6367596
2. List of questions to obtain feedback from experts (Step 4.4), G. Drive, URL: https://tinyurl.com/2qwowtzh
3. Expert opinion (list of questions and expert answer sheet, Step 5), G. Drive, URL: https://tinyurl.com/2ow7ayua
4. Supplementary data used to perform case studies 2 and 3, Zenodo, URL: https://doi.org/10.5281/zenodo.7986369
References
Adeoye-Olatunde, O. A., & Olenik, N. L. (2021). Research and scholarly methods: Semi-structured interviews. Journal of the American College of Clinical Pharmacy, 4(10), 1358–1367.
Adewumi, A., Misra, S., & Omoragbe, N. (2019). FOSSES: Framework for open-source software evaluation and selection. Software: Practice and Experience, 49(5), 780–812.
Adewumi, A., Misra, S., Omoragbe, N., Crawford, B., & Soto, R. (2016). A systematic literature review of open source software quality assessment models. SpringerPlus, 5(1), 1936.
Al-Dhaqm, A., Razak, S., Othman, S. H., Ngadi, A., Ahmed, M. N., & Ali Mohammed, A. (2017). Development and validation of a database forensic metamodel (DBFM). PloS One, 12(2), e0170793.
Alsolai, H., & Roper, M. (2020). A systematic literature review of machine learning techniques for software maintainability prediction. Information and Software Technology, 119, 106214.
Ardito, L., Coppola, R., Barbato, L., & Verga, D. (2020). A tool-based perspective on software code maintainability metrics: A systematic literature review. Scientific Programming, 2020.
Arthur, J. D., & Stevens, K. T. (1989). Assessing the adequacy of documentation through document quality indicators. In Proceedings. Conference on Software Maintenance, 40–49. IEEE.
Aversano, L., & Tortorella, M. (2013). Quality evaluation of floss projects: Application to ERP systems. Information and Software Technology, 55(7), 1260–1276.
Aversano, L., Guardabascio, D., & Tortorella, M. (2017). Analysis of the documentation of ERP software projects. Procedia Computer Science, 121, 423–430.
Bakar, A. D., Sultan, A. B. M., Zulzalil, H., & Din, J. (2012). Review on ‘maintainability’ metrics in open source software. International Review on Computers and Software, 7(3), 903–907.
Bayer, J., & Muthig, D. (2006). A view-based approach for improving software documentation practices. 13th Annual IEEE International Symposium and Workshop on Engineering of Computer-Based Systems (ECBS’06) (p. 10). IEEE.
Beydoun, G., Low, G., Henderson-Sellers, B., Mouratidis, H., Gomez-Sanz, J. J., Pavon, J., & Gonzalez-Perez, C. (2009). FAML: A generic metamodel for MAS development. IEEE Transactions on Software Engineering, 35(6), 841–863.
Boehm, B. W., Brown, H., & Lipow, M. (1978). Quantitative evaluation of software quality. In Proceedings of the 2nd International Conference on Software Engineering, 592–605.
Briand, L., Morasca, S., & Basili, V. (2002). An operational process for goal driven definition of measures. IEEE Transactions on Software Engineering, 28(12), 1106–1125.
Brings, J., Daun, M., Keller, K., Obe, P. A., & Weyer, T. (2020). A systematic map on verification and validation of emergent behavior in software engineering research. Future Generation Computer Systems, 112, 1010–1037.
Butler, S., Gamalielsson, J., Lundell, B., Brax, C., Mattsson, A., Gustavsson, T., & Lönroth, E. (2022). Considerations and challenges for the adoption of open source components in software-intensive businesses. Journal of Systems and Software, 186, 111152.
Chakraborty, S. (2022). TOPSIS and modified TOPSIS: A comparative analysis. Decision Analytics Journal, 2, 100021.
Chawla, M. K., & Chhabra, I. (2015, October). Sqmma: Software quality model for maintainability analysis. In Proceedings of the 8th Annual ACM India Conference, 9–17
Chidamber, S. R., & Kemerer, C. F. (1994). A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6), 476–493.
Codemetrics. (2019). URL: https://plugins.jetbrains.com/plugin/12159-codemetrics
Dagpinar, M., & Jahnke, J. H. (2003, November). Predicting maintainability with object-oriented metrics-an empirical comparison. In 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings, 155–155. IEEE Computer Society.
Dromey, R. G. (1995). A model for software product quality. IEEE Transactions on Software Engineering, 21(2), 146–162.
Dubey, S. K., & Rana, A. (2011). Assessment of maintainability metrics for object-oriented software system. ACM SIGSOFT Software Engineering Notes, 36(5), 1–7.
Duijnhouwer, F. W., & Widdows, C. (2003). Capgemini expert letter open source maturity model. Retrieved: 30 April 2022. Capgemini. URL: tinyurl.com/yxdbvjk6
Dweiri, F., Kumar, S., Khan, S. A., & Jain, V. (2016). Designing an integrated AHP based decision support system for supplier selection in automotive industry. Expert Systems with Applications, 62, 273–283.
Eghan, E. E., Alqahtani, S. S., Forbes, C., & Rilling, J. (2019). API trustworthiness: An ontological approach for software library adoption. Software Quality Journal, 27(3), 969–1014.
Frantz, R. Z., Rehbein, M. H., Berlezi, R., & Roos-Frantz, F. (2019). Ranking open source application integration frameworks based on maintainability metrics: A review of five-year evolution. Software: Practice and Experience, 49(10), 1531–1549.
Garcia, F., Bertoa, M. F., Calero, C., Vallecillo, A., Ruiz, F., Piattini, M., & Genero, M. (2006). Towards a consistent terminology for software measurement. Information and Software Technology, 48(8), 631–644.
Gezici, B., Özdemir, N., Yılmaz, N., Coşkun, E., Tarhan, A., & Chouseinoglou, O. (2019). Quality and success in open source software: A systematic mapping. 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 363–370. IEEE.
Goeb, A. (2013). A meta model for software architecture conformance and quality assessment. Electronic Communications of the EASST, 60.
Grady, R. B. (1992). Practical software metrics for project management and process improvement. Prentice Hall.
Hanefi Calp, M., & Arıcı, N. (2011). Nesne yönelimli tasarım metrikleri ve kalite özellikleriyle ilişkisi [Object-oriented design metrics and their relation to quality characteristics]. Politeknik Dergisi (Journal of Polytechnic), 14(1), 9–14.
Hanine, M., Boutkhoum, O., Tikniouine, A., & Agouti, T. (2016). Application of an integrated multi-criteria decision making AHP-TOPSIS methodology for ETL software selection. Springerplus, 5(1), 1–17.
Hasnain, S., Ali, M. K., Akhter, J., Ahmed, B., & Abbas, N. (2020). Selection of an industrial boiler for a soda-ash production plant using analytical hierarchy process and TOPSIS approaches. Case Studies in Thermal Engineering, 19, 100636.
Hauge, O., Osterlie, T., & Sorensen, C. F. (2009) An empirical study on selection of open source software-preliminary results. In: ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development. IEEE.
Hmood, A., Keivanloo, I., & Rilling, J. (2012, July). SE-EQUAM-an evolvable quality metamodel. In 2012 IEEE 36th Annual Computer Software and Applications Conference Workshops, 334–339. IEEE
Ho, W., & Ma, X. (2018). The state-of-the-art integrations and applications of the analytic hierarchy process. European Journal of Operational Research, 267(2), 399–414.
IEEE standard glossary of software engineering terminology. (1990). IEEE Standard 610.12-1990, pp. 1–84.
IEEE Standard for a Software Quality Metrics Methodology. (1998). IEEE Standard 1061-1998.
Işıklar, G., & Büyüközkan, G. (2007). Using a multi-criteria decision making approach to evaluate mobile phone alternatives. Computer Standards & Interfaces, 29(2), 265–274.
ISO, International Standard ISO VIM. (1993). International vocabulary of basic and general terms in metrology, International Standards Organization, Geneva, Switzerland, second edition.
ISO/IEC 14598-3:1999. (1999). Information technology-software product evaluation-Part 3: Process for developers. International Organization for Standardization, Geneva.
ISO/IEC 15939:2007. (2007). Information Technology—Software Engineering—Software Measurement Process. International Organization for Standardization, Geneva.
ISO/IEC 9126-1:2001. (2001). Software engineering - Product quality -Part 1: Quality model, international organization for standardization, Geneva, Switzerland.
Jha, S., Kumar, R., Abdel-Basset, M., Priyadarshini, I., Sharma, R., & Long, H. V. (2019). Deep learning approach for software maintainability metrics prediction. IEEE Access, 7, 61840–61855.
Jiang, S., Cao, J., & Qi, Q. (2021). Exploring development-related factors affecting the popularity of open source software projects. In 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 244–249. IEEE.
Joshi, A., Kale, S., Chandel, S., & Pal, D. K. (2015). Likert scale: Explored and explained. British Journal of Applied Science & Technology, 7(4), 396.
Khashei-Siuki, A., & Sharifan, H. (2020). Comparison of AHP and FAHP methods in determining suitable areas for drinking water harvesting in Birjand aquifer. Iran. Groundwater for Sustainable Development, 10, 100328.
Khatri, S. K., & Singh, I. (2016). Evaluation of open source software and improving its quality. 5th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO). IEEE.
Kim, H. M. (1999). Representing and reasoning about quality using enterprise models. PhD thesis, Dept. Mechanical and Industrial Engineering, University of Toronto, Canada.
Kitchenham, B., Hughes, R. T., & Linkman, S. G. (2001). Modeling software measurement data. IEEE Transactions on Software Engineering, 27(9), 788–804.
Kläs, M., Lampasona, C., Nunnenmacher, S., Wagner, S., Herrmannsdörfer, M., & Lochmann, K. (2010). How to evaluate meta-models for software quality. In Proceedings of the 20th International Workshop on Software Measurement.
Lenarduzzi, V., Taibi, D., Tosi, D., Lavazza, L., & Morasca, S. (2020). Open source software evaluation, selection, and adoption: A systematic literature review. In: 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE
Li, J., Conradi, R., Bunse, C., Torchiano, M., Slyngstad, O. P. N., & Morisio, M. (2009). Development with off-the-shelf components: 10 facts. IEEE Software, 26(2), 80–87.
List of questions and expert opinion (Step-5). (2023). Retrieved: 20 June 2023. URL: https://tinyurl.com/2ow7ayua
Magaldi, D., & Berler, M. (2020). Semi-structured interviews. Encyclopedia of Personality and Individual Differences, 4825–4830.
McCall, J. A., Richards, P. K., & Walters, G. F. (1977). Factors in software quality, volumes I, II, and III. US Rome Air Development Center Reports, US Department of Commerce, USA.
Mens, T., Doctors, L., Habra, N., Vanderose, B., & Kamseu, F. (2011). Qualgen: Modeling and analysing the quality of evolving software systems. In: 15th European Conference on Software Maintenance and Reengineering. IEEE.
MetricsReloaded. (2004). URL: https://plugins.jetbrains.com/plugin/93-metricsreloaded
Nistala, P., Nori, K. V., & Reddy, R. (2019). Software quality models: A systematic mapping study. International Conference on Software and System Processes (ICSSP), 125–134. IEEE.
Object Management Group (OMG). (2019). Meta Object Facility (MOF). Core specification version 2.5.1. Retrieved: 2 October 2022. URL: https://www.omg.org/spec/MOF/2.5.1/PDF
Othman, S. H., & Beydoun, G. (2010). Metamodelling approach to support disaster management knowledge sharing. In: 21st Australasian Conference on Information Systems.
Othman, S. H., Beydoun, G., & Sugumaran, V. (2014). Development and validation of a disaster management metamodel (DMM). Information Processing & Management, 50(2), 235–271.
Samoladas, I., Goussios, G., & Spinellis, D. (2008). The SQO-OSS quality model: measurement based open source software evaluation. In: IFIP International Conference on Open Source Systems. Springer, Boston, MA.
Saaty, T. L. (1980). The analytic hierarchy process: Planning, priority setting and resource allocation. McGraw-Hill.
Saaty, T. L. (2008). Decision making with the analytic hierarchy process. International Journal of Services Sciences, 1(1), 83–98.
Saaty, T. L., & Sagir, M. (2015). Ranking countries more reliably in the summer olympics. International Journal of the Analytic Hierarchy Process, 7(3), 589–610.
Salem, I. E. B. (2015). Transformational leadership: Relationship to job stress and job burnout in five-star hotels. Tourism and Hospitality Research, 15(4), 240–253.
SciTools Understand. (2020). URL. https://scitools.com/
Semeteys, R. (2006). Method for qualification and selection of open source software (QSOS), version 1.6. Retrieved: 30 April 2022. URL: tinyurl.com/y2phllex
Silva, D. G., Coutinho, C., & Costa, C. J. (2023). Factors influencing free and open-source software adoption in developing countries—an empirical study. Journal of Open Innovation: Technology, Market, and Complexity, 9(1), 21–33.
Sjoberg, G., Orum, A. M., & Feagin, J. R. (2020). A case for the case study. The University of North Carolina Press.
Soto, M., & Ciolkowski, M. (2009). The QualOSS open source assessment model measuring the performance of open source communities. In: 3rd International Symposium on Empirical Software Engineering and Measurement. IEEE.
Spinellis, D., & Jureczko, M. (2011). Metric description [Online]. URL: http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/
Swedberg, R. (2020). Exploratory research. The Production of Knowledge: Enhancing Progress in Social Science, 17–41.
Tanrıöver, Ö. Ö., & Bilgen, S. (2011). A framework for reviewing domain specific conceptual models. Computer Standards & Interfaces, 33(5), 448–464.
Tassone, J., Xu, S., Wang, C., Chen, J., & Du, W. (2018) Quality assessment of open source software: A review. IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), 411–416. IEEE.
Vanderose, B., Habra, N., & Kamseu, F. (2010). Towards a model-centric quality assessment. In Proceedings of the 20th International Workshop on Software Measurement (IWSM 2010): Conference on Software Process and Product Measurement (Stuttgart Nov 2010).
Visconti, M., & Cook, C. R. (2002) An overview of industrial software documentation practice. In 12th International Conference of the Chilean Computer Science Society, 2002. Proceedings, 179–186. IEEE.
Wagner, S., Goeb, A., Heinemann, L., Kläs, M., Lampasona, C., Lochmann, K., Mayr, A., Plösch, R., Seidl, A., Streit, J., & Trendowicz, A. (2015). Operationalised product quality models and assessment: The Quamoco approach. Information and Software Technology, 62, 101–123.
Wasserman, M. P., & Chan, C. (2006). Business readiness rating project, BRR Whitepaper RFC 1. URL: tinyurl.com/y5srd5sq
Wasserman, A. I., Guo, X., McMillian, B., Qian, K., Wei, M. Y., & Xu, Q. (2017). OSSpal: Finding and evaluating open source software. In Open Source Systems: Towards Robust Practices: 13th IFIP WG 2.13 International Conference.
Wohlin, C. (2021). Case study research in software engineering—it is a case, and it is a study, but is it a case study? Information and Software Technology, 133, 106514.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., & Wesslén, A. (2012). Experimentation in software engineering: An introduction. Springer.
Yalcin, A. S., Kilic, H. S., & Delen, D. (2022). The use of multi-criteria decision-making methods in business analytics: A comprehensive literature review. Technological Forecasting and Social Change, 174, 121193.
Yin, R. K. (2018). The case study research and applications. Sage.
Yılmaz, N., & Tarhan, A. K. (2020). Meta-models for software quality and its evaluation: A systematic literature review. In: International Workshop on Software Measurement and the 15th International Conference on Software Process and Product Measurement, Mexico.
Yılmaz, N., & Tarhan, A. K. (2022a). Quality evaluation models or frameworks for open source software: A systematic literature review. Journal of Software: Evolution and Process, 34(6), e2458. https://doi.org/10.1002/smr.2458
Yılmaz, N., & Tarhan, A. K. (2022b). Matching terms of quality models and meta-models: Toward a unified meta-model of OSS quality. Software Quality Journal, 1–53. https://doi.org/10.1007/s11219-022-09603-3
Yilmaz, N., & Tarhan, A. K. (2022c). Definition of the term used in the SQMM. Zenodo. https://doi.org/10.5281/zenodo.6367596
Yilmaz, N., & Tarhan, A. K. (2023). Supplementary document of the article titled ‘Quality evaluation meta-model for open source software’. Zenodo. https://doi.org/10.5281/zenodo.7986369
Zhao, Y., Liang, R., Chen, X., & Zou, J. (2021). Evaluation indicators for open-source software: A review. Cybersecurity, 4(1), 1–24.
Funding
No funding was obtained for this study.
Author information
Authors and Affiliations
Contributions
NY and AKT conceived the presented idea and designed the empirical studies of validation. NY carried out the empirical studies, discussed the results with AKT, and wrote the manuscript. AKT reviewed the manuscript in several iterations and suggested revisions as necessary.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1. Final version of OSS-QMM (please refer to Yılmaz & Tarhan, 2022b for detailed explanations of the concepts and relationships)
Appendix 2. New operationalized quality model derived from OSS-QMM
Appendix 3. Mapping of the terms in the existing OSS quality models (OSMM, OpenBRR, OSSpal, and SQO-OSS) to the concepts of the OSS-QMM
OSS-QMM concepts | Terms in OSS quality models | ||||
---|---|---|---|---|---|
Quality model | OSMM | OpenBRR | OSSpal | SQO-OSS | SQO-OSS |
Viewpoint | Developer | Developer | Developer | Developer | Developer |
OSS aspect | Community-based | Community-based | Community-based | Code-based | Community-based |
Information need | Calculation of developer size to evaluate maintainability | Calculation of developer productivity to evaluate maintainability | Calculation of consulting services quality to evaluate maintainability | Calculation of comment frequency to evaluate maintainability | Calculation of documentation quality to evaluate maintainability |
Characteristic | Maintainability | Maintainability | Maintainability | Maintainability | Maintainability |
Sub-characteristic | Acceptance | Product quality | Support and service | Analyzability | Analyzability |
Entity | Developer | Contributor | Contributor | Source code | Contributor |
Quality requirement | The large size of the developer is desirable for maintainability | Productive developers are desirable for maintainability | The active consulting service is desirable for maintainability | The high comment frequency is desirable for maintainability | The large number of documents is desirable for maintainability |
Impact | Positive | Positive | Positive | Positive | Positive |
Measurable concepts | The size of the developer | Productivity of contributors | The activeness of consulting community | Complexity of source code | Completeness of documentation |
Measure | Number of developers (base measure) | Number of releases (base measure) | Number of the consulting community (base measure) | Weighted method per class (WMC) (base measure) | Number of documents (base measure) |
Unit | Developer | Release | Consulting community | Methods | Documents |
Scale | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSMM) | Integer from zero to three (the score (1–3) is assigned w.r.t. rules given in OpenBRR) | Integer from zero to five (the score (1–5) is assigned w.r.t. rules given in OSSpal) | Integer from zero to infinity | Integer from zero to infinity |
Measurement method | Manually | Manually | Manually | Automatically (e.g., Understand Scitool, CKJM, and Intellij IDEA) | Manually |
Measurement function | There is no measurement function because it is a base measure | There is no measurement function because it is a base measure | There is no measurement function because it is a base measure | There is no measurement function because it is a base measure | There is no measurement function because it is a base measure |
Appendix 4. Design details about case studies: list of selected OSS products, code-based measures, community-based measures, and steps of integrated AHP-TOPSIS method
(a) List of selected OSS products used in the case studies
Product Properties | Apache OFBiz | Adempiere | Compiere |
---|---|---|---|
Website | |||
Product type | Open-source ERP system | Open-source ERP system | Open-source ERP system |
Programming language | Java (1,547,623 LOC) | Java (1,973,229 LOC) | Java (1,402,191 LOC) |
First release date | 2009 | 2006 | 2000 |
(b) List of code-based measures with their description and measurable concepts associated with each measure (Bakar et al., 2012; Chawla & Chhabra, 2015; Dagpinar & Jahnke, 2003; Dubey & Rana, 2011)
Measurable concept (MC) | Measure | Description |
---|---|---|
MC1: complexity of source code | WMC: weighted methods per class | Measures the number of methods in a class, weighted by their complexity (Hanefi Calp et al., 2011). As the number of methods increases, the time needed to analyze the code also increases (Chidamber & Kemerer, 1994)
 | CC: cyclomatic complexity | Measures the number of linearly independent paths through the program source code and is directly related to the complexity of the code. A high value of this metric is undesirable and reduces source code analyzability
 | NNL: number of nested levels | Measures the depth of nesting of the loops in a class; a higher value of this metric reduces testability and stability
MC2: comment frequency of source code | NOS: number of statements | Measures the frequency of comments and explanations, which helps reduce the perceived complexity of the software and facilitates tracking and understanding the program
MC3: inheritance complexity degree of source code | DIT: depth of inheritance tree | Measures the distance of a class from the root of the inheritance tree (Chidamber & Kemerer, 1994). A high depth of the tree increases complexity since more classes and methods are involved, indicating low changeability and stability of the software product
 | NOC: number of children | Measures the number of subclasses derived from a class. A high value indicates greater reuse, but also that more errors may occur and that more effort is required during testing (Chidamber & Kemerer, 1994)
MC4: interaction complexity (coupling) degree of source code | CBO: coupling between object classes | Represents the number of classes coupled to a given class; a coupling exists when properties or methods of one class are used by another class without an inheritance relationship (Chidamber & Kemerer, 1994). High levels of dependence between classes harm modular design and reduce changeability
 | RFC: response for a class | Measures the number of methods that can be executed in response to a message received by an object of the class, i.e., the total number of methods defined in the class plus the methods they call (Chidamber & Kemerer, 1994). Software products with a lower RFC value can be understood and tested more easily
MC5: cohesion degree of source code | LCOM: lack of cohesion of methods | Measures the degree to which the methods of a class lack similarity with each other (Chidamber & Kemerer, 1994). Low values of this metric are therefore desirable
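The code-based measures above are typically collected per class by static-analysis tools (e.g., CKJM, SciTools Understand, or the IntelliJ IDEA plugins listed in Appendix 3) and then aggregated to the product level. The following is a minimal aggregation sketch, assuming a hypothetical CSV export with one row per class and one numeric column per metric; the file layout, column names, and the use of a simple mean are assumptions for illustration, not the actual tool output format or the study's aggregation procedure.

```python
import csv
from statistics import mean

# Code-based measures of Appendix 4(b); column names are assumed for this sketch.
METRICS = ["WMC", "CC", "NNL", "NOS", "DIT", "NOC", "CBO", "RFC", "LCOM"]

def aggregate_class_metrics(csv_path: str) -> dict:
    """Average per-class metric values over all classes of one OSS product.

    Assumes a hypothetical CSV export with one row per class and one
    numeric column per metric listed in METRICS.
    """
    per_metric = {m: [] for m in METRICS}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            for m in METRICS:
                value = row.get(m)
                if value not in (None, ""):
                    per_metric[m].append(float(value))
    return {m: mean(values) for m, values in per_metric.items() if values}

# Example (file name and values purely illustrative):
# aggregate_class_metrics("ofbiz_class_metrics.csv")
# -> {"WMC": 11.4, "CC": 2.7, "DIT": 1.8, ...}
```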
(c) List of the community-based measures with their equation and measurable concepts associated with each measure
Measurable concept (MC) | Measure | Measurement functions (equation) |
---|---|---|
MC6: difficulty degree of bug | *BSI: bug severity index | \(\frac{\#\text{ of blocker}}{\text{LOC}}\times 9+\frac{\#\text{ of critical}}{\text{LOC}}\times 7+\frac{\#\text{ of major}}{\text{LOC}}\times 5+\frac{\#\text{ of minor}}{\text{LOC}}\times 3+\frac{\#\text{ of trivial}}{\text{LOC}}\times 1\)
MC7: completeness of documentation | ND: number of documents | No equation (it is a base measure)
MC8: the activeness of the community | *CD: commit density | \(\frac{(\#\text{ of commits})/(\#\text{ of developers})}{\text{kLOC}}\)
 | *ED: email density | \(\frac{(\#\text{ of emails})/(\#\text{ of developers})}{\text{LOC}/(\#\text{ of releases})}\)
MC9: size of the development community | NC: number of contributors | No equation (it is a base measure)
MC10: performance of contributor | *FRIS: feature request implementation success | \(\frac{(\#\text{ of closed feature requests})/(\#\text{ of total feature requests})}{\text{kLOC}}\)
 | *BSSR: bug-solving success rate | \(\frac{(\#\text{ of closed bugs})/(\#\text{ of total bugs})}{\text{kLOC}}\)
MC11: productivity of contributors | NR: number of releases | No equation (it is a base measure)
MC12: fault proneness of the contributor | *DD: defect density | \(\frac{\#\text{ of total defects}}{\text{LOC}}\)
MC13: maturity of project | PA: product age | No equation (it is a base measure)
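For illustration, the derived measures defined by the equations above can be computed from their base measures as in the following sketch; the function and parameter names are chosen for readability and are not part of the original study's tooling.

```python
def bug_severity_index(blocker, critical, major, minor, trivial, loc):
    """BSI: severity-weighted bug counts normalized by lines of code (LOC)."""
    weights = {"blocker": 9, "critical": 7, "major": 5, "minor": 3, "trivial": 1}
    counts = {"blocker": blocker, "critical": critical, "major": major,
              "minor": minor, "trivial": trivial}
    return sum((counts[level] / loc) * w for level, w in weights.items())

def commit_density(commits, developers, kloc):
    """CD: commits per developer, normalized by product size in kLOC."""
    return (commits / developers) / kloc

def email_density(emails, developers, loc, releases):
    """ED: emails per developer, normalized by average size per release (LOC / releases)."""
    return (emails / developers) / (loc / releases)

def feature_request_implementation_success(closed_fr, total_fr, kloc):
    """FRIS: share of implemented feature requests, normalized by kLOC."""
    return (closed_fr / total_fr) / kloc

def bug_solving_success_rate(closed_bugs, total_bugs, kloc):
    """BSSR: share of solved bugs, normalized by kLOC."""
    return (closed_bugs / total_bugs) / kloc

def defect_density(total_defects, loc):
    """DD: defects per line of code."""
    return total_defects / loc
```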
(d) Integrated AHP-TOPSIS method used for quality evaluation in the case studies
Appendix 5. List of community-based measures and their descriptions
Measure | Description |
---|---|
Bug severity index (BSI) | Bug severity is a classification of a software defect (bug) that indicates the degree of its negative impact on software quality. It generally consists of the following five levels, from most important to least important: blocker, critical, major, minor, and trivial. The bug reporting database of the OSS products is investigated to determine the severity of bugs. Since the impact of each level of bugs on the product is different, bugs are weighted according to their severity as follows: blocker = 9, critical = 7, major = 5, minor = 3, and trivial = 1. Accordingly, the number of bugs at each severity level, lines of code (LOC), and the weights assigned according to bug severity are used as base measures to calculate BSI via the equation in Appendix 4(c). That is, the number of bugs at each severity level is divided by the size of the product, and the results are multiplied by the weight of each level. Then, the results obtained for each severity level are summed up |
Number of documents (ND) | Software documentation plays a very important role in all phases of a software system’s life cycle. In the context of ERP, software documentation is even more relevant due to the complexity of such software systems and the strategic role they play within operative organizations. It is important not only from the viewpoint of software engineering but also from the viewpoint of the user. The user needs to know how to use and install an OSS system, import and manage data, etc. In this context, some documents (e.g., a user guide, a technical guide, a database installation guide, a developer guide, API documentation, and Wiki pages) can be available for OSS systems in different formats (e.g., PDF and HTML). To access these documents, the websites of the three open-source ERP systems and their different cloud repositories (e.g., SourceForge and GitHub) were visited, and the number of accessible documents was collected. Since ND is already a base measure, there is no need for an equation, as indicated in Appendix 4(c) |
Commit density (CD) | A commit is an operation that sends the latest changes (e.g., added or removed lines) of the source code to the repository. Every change to the source code of an OSS system has a purpose, e.g., to adapt, correct, perfect, or extend the system. After the changes are performed, these commits are stored in a version control system, such as a concurrent version system (CVS) or Git. In this context, a high number of commits may provide information about the activeness of the developers and the potential of the product to evolve through changes. The number of commits, the number of developers, and kilo lines of code (kLOC) are used as base measures to calculate commit density via the equation in Appendix 4(c). A high number of developers and a large product size (i.e., kLOC) are likely to result in an increase in the number of commits; therefore, these base measures are used to calculate commit density as specified in the equation in Appendix 4(c) |
Email density (ED) | The mailing lists of the OSS are stored monthly in the CVS archive, and anyone with an interest in development can join them. In the case study, we considered the mailing list among developers. It contains different sorts of messages, including technical discussions, proposed changes, automatic notification messages about changes in the code, and problem reports. The number of emails, the number of developers, lines of code (LOC), and the number of versions (releases) are used as base measures to calculate email density via the equation in Appendix 4(c). As each version introduces new features, proposed changes and technical discussions among developers will increase, which will trigger an increase in email traffic. The number of developers is also important in calculating this measure, as the number of emails among developers is taken into account. Also, the size of the product (i.e., LOC) is highly likely to be directly proportional to the number of emails; that is, as the size of the product increases, the number of emails is expected to increase because the problems related to the product will also increase. Therefore, these base measures are used to calculate ED as specified in the equation in Appendix 4(c) |
Number of contributors (NC) | The saying “many hands make light work” certainly holds true for an open-source product. If people’s ambitions and interests change, they often move on to other things. Also, an increase in the number of contributors to the OSS product increases the heterogeneity of the community, and this heterogeneity contributes to the quality of OSS products. For example, if the contributors are employees of a small company, there is a risk of the company cutting its support. As a result, the larger the group of contributors, the smaller the chance that the development of the OSS product stalls. Since NC is already a base measure, there is no need for an equation, as indicated in Appendix 4(c) |
Feature request implementation success (FRIS) | Feedback from OSS users or developers constitutes a vital part of the evolution of OSS projects. In this context, issue tracking systems (ITS) such as Bugzilla serve to request new features or enhancements to the OSS. Users or developers express their demands for further development of the OSS product as feature requests in the ITS. At this point, developers are required to implement these incoming feature requests to evolve the OSS product. Therefore, success in implementing these feature requests is important for the quality of OSS. This measure is directly related to the performance of the developers with respect to feature request implementation. The number of closed (implemented) feature requests, the number of total feature requests, and kilo lines of code (kLOC) are used as base measures to calculate FRIS via the equation in Appendix 4(c) |
Bug-solving success rate (BSSR) | A software bug is an error, flaw, failure, or fault in a computer program or system that causes it to produce an incorrect or unexpected result or to behave in unintended ways. Some issue tracking systems (e.g., Bugzilla, Trac, or OTRS (open-source ticket request system)) are used to track bug reports of OSS products. Teams that are more capable or disciplined in handling incoming bugs are generally considered more successful; that is, this measure is directly related to the performance of the developers with respect to bug solving. The number of closed (solved) bugs, the number of total bugs, and kilo lines of code (kLOC) are used as base measures to calculate BSSR via the equation in Appendix 4(c) |
Number of releases (NR) | A software release is a change or set of changes packaged for delivery to the end user, typically adding new features or enhancements. As expectations from software are constantly changing, many versions of the software may be released over its lifetime. The addition of new features and enhancements with new versions can be regarded as an indication of the productivity of the developers. Although releases are classified as minor, major, and emergency releases, the total number of releases is considered in this case study. Since NR is already a base measure, there is no need for an equation, as indicated in Appendix 4(c) |
Defect density (DD) | Defect density is the number of defects detected in a software component during a defined period of development or operation, divided by the size of the software component. Minimizing defect density is important for establishing the time, cost, and quality balance that is critical to the quality of OSS projects. The number of total defects and lines of code (LOC) are used as base measures to calculate DD via the equation in Appendix 4(c) |
Product age (PA) | The age of a product is the time from when the product was created to the present. The longer a product remains under active development, the smaller the chance that its development will suddenly stop. In this sense, the first year is the biggest challenge and hurdle for open-source initiatives: an initiative is often halted because a small community cannot sustain the workload that the product generates. Since developers typically receive no financial compensation, the project needs to attract new developers for the OSS product to live on for many years. In this context, the longevity of the product is important for product quality. Since PA is already a base measure, there is no need for an equation, as indicated in Appendix 4(c) |
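As a purely illustrative numeric example of the BSI calculation described above (the bug counts and product size are hypothetical, not case-study data), consider a product of 100,000 LOC with 2 blocker, 5 critical, 20 major, 40 minor, and 10 trivial bugs:

\[
\mathrm{BSI}=\frac{2}{100{,}000}\times 9+\frac{5}{100{,}000}\times 7+\frac{20}{100{,}000}\times 5+\frac{40}{100{,}000}\times 3+\frac{10}{100{,}000}\times 1=\frac{283}{100{,}000}\approx 2.83\times {10}^{-3}
\]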
Appendix 6. Description of evaluation methods with their formulae used in the case studies
Technique | Description | Equation | |
---|---|---|---|
AHP | The AHP method consists of the following steps. Please see (Ho & Ma, 2018; Saaty, 1980, 2008) for details | ||
Step 1 | First, structural hierarchies are created. The OSS-QMM concepts of OSS aspect and quality characteristic provide this hierarchy | No equation |
Step 2 | A pair-wise comparison matrix A (of size \(n\times n\)) is constructed to compare the criteria in pairs. Each OSS aspect and its related sub-characteristics serve as “criteria” | \(A={[{x}_{ij}]}_{n\times n}\), where A is the pair-wise comparison matrix |
Step 3 | Pair-wise comparisons are performed by comparing the relative importance of two selected criteria. The matrix A is filled in using the 1–9 scale proposed by Saaty (Saaty, 1980) (see (Ho & Ma, 2018) for details) | A pair-wise comparison is performed on matrix A, and the matrix is filled out |
Step 4 | The matrix A is normalized, and the normalized pair-wise comparison matrix \({A}_{\text{norm}}\) is obtained: each element of matrix A in a column is divided by the sum of the elements in the same column | \({A}_{\text{norm}}={[{a}_{ij}]}_{n\times n}\), \({a}_{ij}=\frac{{x}_{ij}}{\sum_{i=1}^{n}{x}_{ij}}\) | (1)
Step 5 | The final weight of each criterion is calculated | \({w}_{i}=\frac{\sum_{j=1}^{n}{a}_{ij}}{n}\) and \(\sum_{i=1}^{n}{w}_{i}=1\), \(i,j=1,2,\dots ,n\) | (2)
Step 6 | The consistency ratio (CR) is calculated to check the consistency of the decision-maker’s judgments. First, the consistency index (CI) is calculated, where \({\lambda }_{\max }\) is the largest eigenvalue of the pair-wise comparison matrix (see (Saaty, 2008; Ho & Ma, 2018) for details) and n is the number of criteria being compared. Then, CR is calculated, where the random index (RI) is a value that depends on the number of criteria n (see (Saaty, 2008; Ho & Ma, 2018) for the values of RI according to n) | \(CI=\left({\lambda }_{\max }-n\right)/(n-1)\) | (3)
 | | \(CR=CI/RI\) | (4)
Step 7 | The final weight of each criterion is approved | No equation | |
TOPSIS | The final weight of each criterion obtained from the AHP method is used as input to the TOPSIS method. The TOPSIS method consists of the following steps (please see (Hasnain et al., 2020; Chakraborty, 2022; Işıklar & Büyüközkan, 2007) for details) | ||
Step 1 | First, the decision matrix \(B={[{b}_{ij}]}_{m\times n}\) is constructed, where the m alternatives (i.e., OSS products) are in the rows and the n evaluation criteria (i.e., measurable concepts) are in the columns | \(B={[{b}_{ij}]}_{m\times n}\), where B is the decision matrix |
Step 2 | The normalized decision matrix \(R={[{r}_{ij}]}_{m\times n}\) is constructed | \({r}_{ij}={b}_{ij}/\sqrt{\sum_{i=1}^{m}{b}_{ij}^{2}}\), \(i=1,2,3,\dots ,m\); \(j=1,2,3,\dots ,n\) | (5)
Step 3 | The final weights obtained from the AHP method are multiplied by the values of the normalized decision matrix R; thus, the weighted normalized decision matrix \(V={[{v}_{ij}]}_{m\times n}\) is obtained | \({v}_{ij}={w}_{j}\times {r}_{ij}\), \(i=1,2,3,\dots ,m\); \(j=1,2,3,\dots ,n\) | (6)
Step 4 | In this step, two artificial alternatives, \({A}^{+}\) (the positive ideal solution) and \({A}^{-}\) (the negative ideal solution), are defined by Eqs. (7) and (8), respectively. Here, J is the subset of criteria indices \(\{j=1,2,\dots ,n\}\) associated with positive impact in the OSS-QMM, and \({J}^{-}\) is the complement set of J (criteria with negative impact) | \({A}^{+}=\{({\max }_{i}{v}_{ij}\mid j\in J),({\min }_{i}{v}_{ij}\mid j\in {J}^{-})\mid i=1,2,3,\dots ,m\}=\{{v}_{1}^{+},{v}_{2}^{+},\dots ,{v}_{j}^{+},\dots ,{v}_{n}^{+}\}\) | (7)
 | | \({A}^{-}=\{({\min }_{i}{v}_{ij}\mid j\in J),({\max }_{i}{v}_{ij}\mid j\in {J}^{-})\mid i=1,2,3,\dots ,m\}=\{{v}_{1}^{-},{v}_{2}^{-},\dots ,{v}_{j}^{-},\dots ,{v}_{n}^{-}\}\) | (8)
Step 5 | In this step, separation measurement is performed by calculating the distance between each alternative in V and the positive ideal solution \({A}^{+}\) or the negative ideal solution \({A}^{-}\) using the Euclidean distance, as given by Eqs. (9) and (10), respectively. At the end of Step 5, two values, \({S}_{i}^{+}\) and \({S}_{i}^{-}\), have been computed for each alternative; they represent its distance to the positive ideal and the negative ideal solutions | \({S}_{i}^{+}=\sqrt{\sum_{j=1}^{n}{\left({v}_{ij}-{v}_{j}^{+}\right)}^{2}}\), \(i=1,2,3,\dots ,m\) | (9)
 | | \({S}_{i}^{-}=\sqrt{\sum_{j=1}^{n}{\left({v}_{ij}-{v}_{j}^{-}\right)}^{2}}\), \(i=1,2,3,\dots ,m\) | (10)
Step 6 | In this step, the closeness of \({A}_{i}\) (the ith alternative) to the ideal solution \({A}^{+}\) is defined, as shown in Eq. (11). \({C}_{i}^{*}=1\) if and only if \({A}_{i}={A}^{+}\); similarly, \({C}_{i}^{*}=0\) if and only if \({A}_{i}={A}^{-}\) | \({C}_{i}^{*}={S}_{i}^{-}/\left({S}_{i}^{-}+{S}_{i}^{+}\right)\), \(0\le {C}_{i}^{*}\le 1\), \(i=1,2,3,\dots ,m\) | (11)
Step 7 | The set of alternatives \({A}_{i}\) can now be ranked in descending order of \({C}_{i}^{*}\); a higher value corresponds to better performance | No equation |
Weighted distribution | The weight of each sub-characteristic can differ for each OSS aspect (these weights are calculated in the AHP process). Therefore, the final weight of each sub-characteristic specific to an OSS aspect is calculated. Here, \({X}_{i}\) is the final weight of a sub-characteristic for an OSS aspect, \({w}_{i}^{a}\) are the weights of the OSS aspects, \({w}_{j}^{s}\) are the weights of the sub-characteristics, \(i\) indexes the OSS aspects (there are two OSS aspects), and \(m\) is the number of sub-characteristics | \({X}_{i}=({w}_{i}^{a}\times {w}_{j}^{s})/\sum_{i=1}^{n}{w}_{i}^{a}\), \(\sum_{i=1}^{n}{w}_{i}^{a}=1\) (see Eq. (2)), \(i=\{1,2\}\), \(j=\{1,2,3,\dots ,m\}\) | (12)
Mathematical equations | Mathematical equations are used to obtain derived measures from base measures within the concept of measurement function. For example, if M1 and M2 are base measures, M3 is a derived measure obtained from them via the equation M3 = M1/(M1 + M2). This equation therefore corresponds to the concept of measurement function in the OSS-QMM | The equation differs for each derived measure |
Average of the measures | In cases where multiple measures are associated with a measurable concept, these measures should be aggregated. In this aggregation process, the normalized measures (obtained in Step 2 of TOPSIS) associated with a measurable concept are averaged. Here, \(p\) is the number of alternatives (OSS products), \({m}_{(k)}\) is the new value of the measures associated with a measurable concept for the kth alternative, \({r}_{kj}\) is a normalized measure (Step 2 of TOPSIS), and \(m\) and \(n\) are the first and last indices of the measures associated with the measurable concept, respectively | \({m}_{(k)}=\sum_{j=m}^{n}{r}_{kj}/(n-m+1)\), \(k=\{1,2,3,\dots ,p\}\) | (13)
Linear utility function | Utility functions for each OSS product can be defined to operationalize the evaluation step. The higher the evaluation value of an OSS product (i.e., the better it is for software quality), the higher the associated utility should be. To reflect this, simple increasing linear utility functions can be selected with two thresholds, min and max, as shown in Fig. 3 | See Fig. 3 |
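To make the computational flow of Appendix 6 concrete, the sketch below chains the AHP weighting steps (Eqs. (1)–(4)) and the TOPSIS ranking steps (Eqs. (5)–(11)) using NumPy. It is a minimal illustration under simplifying assumptions (a single pairwise-comparison matrix, one benefit/cost flag per criterion, made-up numbers); it omits the weighted distribution of Eq. (12), the aggregation of Eq. (13), and the utility functions, and it is not the study's actual evaluation code.

```python
import numpy as np

# Random index (RI) values for n = 1..10 criteria (Saaty, 1980).
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(pairwise: np.ndarray):
    """Eqs. (1)-(4): column-normalize, average rows, check consistency."""
    n = pairwise.shape[0]
    a_norm = pairwise / pairwise.sum(axis=0)             # Eq. (1)
    w = a_norm.mean(axis=1)                               # Eq. (2)
    lambda_max = (pairwise @ w / w).mean()                # estimate of the largest eigenvalue
    ci = (lambda_max - n) / (n - 1)                       # Eq. (3)
    cr = ci / RI[n]                                       # Eq. (4)
    return w, cr

def topsis_rank(decision: np.ndarray, weights: np.ndarray, benefit: np.ndarray):
    """Eqs. (5)-(11): rank alternatives (rows) against criteria (columns)."""
    r = decision / np.sqrt((decision ** 2).sum(axis=0))       # Eq. (5)
    v = weights * r                                            # Eq. (6)
    a_pos = np.where(benefit, v.max(axis=0), v.min(axis=0))    # Eq. (7)
    a_neg = np.where(benefit, v.min(axis=0), v.max(axis=0))    # Eq. (8)
    s_pos = np.sqrt(((v - a_pos) ** 2).sum(axis=1))            # Eq. (9)
    s_neg = np.sqrt(((v - a_neg) ** 2).sum(axis=1))            # Eq. (10)
    closeness = s_neg / (s_neg + s_pos)                        # Eq. (11)
    return closeness, np.argsort(-closeness)

# Illustrative run with made-up numbers (three products, three criteria).
A = np.array([[1, 3, 5], [1/3, 1, 3], [1/5, 1/3, 1]], dtype=float)
w, cr = ahp_weights(A)
assert cr < 0.10, "judgments should be consistent (CR < 0.10)"
B = np.array([[0.8, 0.4, 12.0], [0.6, 0.7, 9.0], [0.9, 0.5, 15.0]])
benefit = np.array([True, True, False])  # last criterion has a negative impact
scores, ranking = topsis_rank(B, w, benefit)
```

A CR below 0.10 is the usual consistency threshold (Saaty, 1980); the closeness scores returned by `topsis_rank` order the alternatives (OSS products) from best to worst.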
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yılmaz, N., Tarhan, A.K. Quality evaluation meta-model for open-source software: multi-method validation study. Software Qual J 32, 487–541 (2024). https://doi.org/10.1007/s11219-023-09658-w
DOI: https://doi.org/10.1007/s11219-023-09658-w