Benchmarking quality measurement

Abstract

This paper presents a simple benchmarking procedure for companies wishing to develop measures of quality attributes for software artefacts. The procedure does not require that a proposed measure be a consistent measure of a quality attribute; it requires only that the measure shows agreement most of the time. The procedure provides summary statistics for measures of quality attributes of a software artefact. These statistics can be used to benchmark subjective direct measurement of a quality attribute by a company's software developers. Each proposed measure is expressed as a set of error rates for measurement on an ordinal scale, and these error rates enable simple benchmarking statistics to be derived. The statistics can also be derived for any proposed objective indirect measure or prediction system for the quality attribute. For an objective measure or prediction system to be of value to the company, it must be 'better' or 'more objective' than the organisation's current measurement or prediction capability; confidence that the benchmark's objectivity has been surpassed must therefore be demonstrated. Using Bayesian statistical inference, the paper shows how to decide whether a new measure should be considered 'more objective', or whether a prediction system's predictive capability should be considered 'better', than the current benchmark. Furthermore, the Bayesian inferential approach is easy to use and provides clear advantages for quantifying and inferring differences in objectivity.
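The abstract's central decision, whether a proposed measure's agreement rate credibly exceeds the benchmark's, can be illustrated with a small Bayesian calculation. The sketch below is an assumption-laden illustration, not the paper's actual model (the paper uses BUGS, i.e. Gibbs sampling): the agreement counts are invented, and a simple Beta-Binomial formulation with uniform priors stands in for whatever error-rate model the paper derives.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical agreement counts: how often each approach assigned the
# same ordinal quality category as the reference classification.
benchmark_agree, benchmark_n = 62, 100  # current subjective benchmark
proposed_agree, proposed_n = 74, 100    # proposed objective measure

# With a uniform Beta(1, 1) prior, each agreement rate has posterior
# Beta(agreements + 1, disagreements + 1); draw Monte Carlo samples.
samples = 100_000
theta_benchmark = rng.beta(benchmark_agree + 1,
                           benchmark_n - benchmark_agree + 1, samples)
theta_proposed = rng.beta(proposed_agree + 1,
                          proposed_n - proposed_agree + 1, samples)

# Posterior probability that the proposed measure is 'more objective',
# i.e. agrees with the reference classification more often.
p_better = np.mean(theta_proposed > theta_benchmark)
print(f"P(proposed more objective than benchmark) = {p_better:.3f}")
```

Under this kind of analysis, a company would adopt the proposed measure only if the posterior probability cleared an agreed threshold (0.95, say).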

References

  • Altman, D. G. (1999). Practical statistics for medical research. London: Chapman and Hall/CRC.

  • Coleman, D., Ash, D., Lowther, D., & Oman, P. (1994). Using metrics to evaluate software system maintainability. IEEE Computer, 27(8), 44–49.

  • Conte, S. D., Dunsmore, H. D., & Shen, V. Y. (1986). Software engineering metrics and models. Menlo Park, CA: Benjamin-Cummings.

  • CMMI Product Team. (2001). Capability maturity model integration for systems engineering, software engineering, integrated product and process development, and supplier sourcing (CMMI-SE/SW/IPPD/SS), version 1.1, continuous representation. Pittsburgh: Software Engineering Institute, Carnegie Mellon University, December.

  • Dawid, A. P., & Skene, A. M. (1979). Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, 28(1), 20–28.

  • Fenton, N. (1994). Software measurement: A necessary scientific basis. IEEE Transactions on Software Engineering, 20(3), 199–206.

  • Fenton, N. E., Krause, P., & Neil, M. (2002). Software measurement: Uncertainty and causal modelling. IEEE Software, 19(4), 116–122.

  • Fenton, N. E., & Neil, M. (1998). A strategy for improving safety related software engineering standards. IEEE Transactions on Software Engineering, 24(11), 1002–1013.

  • Fenton, N. E., & Pfleeger, S. L. (1997). Software metrics: A rigorous and practical approach (2nd ed.). PWS Publishing.

  • Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1998). Bayesian data analysis. Chapman & Hall.

  • ISO/IEC 9126-1:2001. Software engineering – Product quality – Part 1: Quality model. Geneva: International Organization for Standardization.

  • Khoshgoftaar, T. M., Seliya, N., & Gao, K. (2005). Assessment of a new three-group software quality classification technique: An empirical case study. Empirical Software Engineering, 10(3), 183–218.

  • Kitchenham, B., Pfleeger, S. L., & Fenton, N. (1995). Towards a framework for software measurement validation. IEEE Transactions on Software Engineering, 21(12), 929–944.

  • Kitchenham, B., & Pfleeger, S. L. (2003). Principles of survey research part 6: Data analysis. ACM SIGSOFT Software Engineering Notes, 28(2), 24–27.

  • Kyburg, H. E. (1984). Theory and measurement. Cambridge: Cambridge University Press.

  • Lindley, D. V. (2000). The philosophy of statistics. The Statistician, 49(3), 293–337.

  • Moses, J. (2000). Bayesian probability distributions for assessing subjectivity in the measurement of subjective software attributes. Information and Software Technology, 42(8), 533–546.

  • Moses, J. (2001). A consideration of the impact of interactions with module effects on the direct measurement of subjective software attributes. In 7th IEEE symposium on software metrics, London, UK, pp. 112–123, April.

  • Roberts, F. S. (1979). Measurement theory (Encyclopedia of Mathematics and Its Applications, Vol. 7). Reading, MA: Addison-Wesley.

  • Shepperd, M. (1990). Early life-cycle metrics and software quality models. Information and Software Technology, 32(4), 311–316.

  • Smith, J. Q. (1992). Decision analysis: A Bayesian approach. London: Chapman and Hall.

  • Spiegelhalter, D. J., Thomas, A., Best, N., & Gilks, W. (1996). BUGS 0.5: Bayesian inference using Gibbs sampling manual (version ii). Cambridge: MRC Biostatistics Unit, August.

  • Stevens, W. P., Myers, G. J., & Constantine, L. L. (1974). Structured design. IBM Systems Journal, 13(2), 115–139.

  • Yourdon, E., & Constantine, L. L. (1979). Structured design. Englewood Cliffs, NJ: Prentice-Hall.


Acknowledgements

The author wishes to thank Professor Martin Shepperd of Brunel University for access to the maintainability classification data. In addition, the author acknowledges the ESRC and the MRC Biostatistics Unit, Cambridge, for the use of the BUGS simulation program.

Author information


Corresponding author

Correspondence to John Moses.


About this article

Cite this article

Moses, J. Benchmarking quality measurement. Software Qual J 15, 449–462 (2007). https://doi.org/10.1007/s11219-007-9025-4
