Comparing the Effectiveness of Using Design and Code Measures in Software Faultiness Estimation

Authors:
Sandro Morasca, Dipartimento di Scienze Teoriche e Applicate, Università degli Studi dell'Insubria, Varese, Italy
Luigi Lavazza, Dipartimento di Scienze Teoriche e Applicate, Università degli Studi dell'Insubria, Varese, Italy

EASE '19: Proceedings of the 23rd International Conference on Evaluation and Assessment in Software Engineering, April 2019, Pages 112–121
https://doi.org/10.1145/3319008.3319026

Published: 15 April 2019

ABSTRACT

Background. Early identification of software modules that are likely to be faulty helps practitioners take timely action to improve these modules' quality and reduce development costs in the remainder of the development process. To this end, module faultiness estimation models can be built at any point during development by using the measures collected up to that time. Models available in later phases are expected to be more accurate than those available in earlier phases. However, waiting until late in the development process may reduce the effectiveness of any software quality improvement actions and increase their cost.

Aims. Our goal is to investigate to what extent using software code measures together with software design measures improves the accuracy of module faultiness estimation, compared with using software design measures alone.

Method. We built faultiness estimation models, using Binary Logistic Regression, Naive Bayes, Support Vector Machines, and Decision Trees, for 54 datasets from the PROMISE repository. These datasets contain design measures, code measures, and faultiness data for software modules of real-life projects. Using a few accuracy indicators, we compared the models built with code and design measures together against the models built with design measures alone.
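The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the accuracy indicators, so the F-measure and the Matthews correlation coefficient (MCC) are assumed here, and the module labels and model predictions are invented toy data, not results from the paper.

```python
from math import sqrt


def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means 'faulty'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def f_measure(tp, fp, tn, fn):
    """Harmonic mean of precision and recall (assumed indicator)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (assumed indicator)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def compare(actual, design_only_pred, design_and_code_pred):
    """Map each indicator to (design-only, design+code, difference)."""
    base = confusion_counts(actual, design_only_pred)
    full = confusion_counts(actual, design_and_code_pred)
    result = {}
    for name, indicator in (("f_measure", f_measure), ("mcc", mcc)):
        b, f = indicator(*base), indicator(*full)
        result[name] = (b, f, f - b)
    return result


if __name__ == "__main__":
    # Hypothetical data: 10 modules, 4 of them actually faulty.
    actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    design_only = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    design_and_code = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    for name, (b, f, d) in compare(actual, design_only, design_and_code).items():
        print(f"{name}: design-only={b:.3f} design+code={f:.3f} gain={d:.3f}")
```

In a study of this kind, a comparison like this would be repeated for each dataset and each modeling technique, and the per-dataset differences would then be analyzed together.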

Results. The results indicate that the models built by using code measures and design measures together are only slightly more accurate than the models built by using design measures alone.

Conclusions. Our analysis shows that measures obtainable during design can yield models that are almost as accurate as models that can be built in later development phases. This is good news for practitioners, who can rely on fairly accurate models to start quality improvement initiatives early, when they are cheaper and more effective.

Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Software development process management
      1. Risk management

Published in
EASE '19: Proceedings of the 23rd International Conference on Evaluation and Assessment in Software Engineering
April 2019
345 pages
ISBN: 9781450371452
DOI: 10.1145/3319008
Program Chairs: Shaukat Ali, Vahid Garousi
Copyright © 2019 ACM
© 2019 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Software faultiness
Software measures
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
EASE '19 Paper Acceptance Rate: 20 of 73 submissions, 27%
Overall Acceptance Rate: 71 of 232 submissions, 31%