ABSTRACT
Background: Capture-ReCapture (CRC), a technique for post-inspection defect estimation, has been studied in the Software Engineering (SE) community since the 1990s. Most studies have focused on evaluating the performance of various CRC models and estimators; few have assessed the credibility of the estimation results, which makes it difficult to base quality-management decisions on CRC defect estimates. Objective: This research explores a reliable and practical approach to assessing the credibility of CRC-based defect estimation. Method: A fundamental assumption of the CRC method is the statistical independence of samples, which can be measured by the 'Coefficient of CoVariation' (CCV). We used CCV as an indicator of the statistical dependence between observations (i.e., the defects detected by inspectors) and assessed CRC estimates on datasets published in the SE literature by examining the correlation between Relative Error (RE) and CCV. Based on the observed correlation, we further propose CĈV, which replaces the unknown N (the actual number of defects) with its estimate (N̂), to assess the credibility of CRC estimates. Results: Most datasets have non-zero CCVs, and the R² (coefficient of determination) of the non-linear curve fit between their CCVs and REs is higher than 0.8. Conclusions: Our study provides evidence that statistical dependence among inspectors is ubiquitous in existing CRC-related studies. Moreover, the significant correlation between CCV (CĈV in practice) and RE may make it possible to assess CRC-based estimates in support of quality management.
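To make the quantities named in the Method concrete, the following is a minimal sketch for the two-inspector case only. It assumes the sample form of the coefficient of covariation used in the capture-recapture literature, CCV = N * m12 / (n1 * n2) - 1, and a signed relative error RE = (N̂ - N) / N; the abstract does not state either formula, so both are assumptions, as are all function names and the example counts.

```python
def ccv_two_inspectors(n1: int, n2: int, m12: int, n_total: float) -> float:
    """Sample CCV for two inspectors (assumed form: N * m12 / (n1 * n2) - 1).

    n1, n2  : defects found by inspector 1 and inspector 2
    m12     : defects found by both inspectors
    n_total : total number of defects N; pass an estimate N-hat instead
              to obtain the plug-in variant the abstract calls C-hat-CV
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("each inspector must have found at least one defect")
    return n_total * m12 / (n1 * n2) - 1.0


def lincoln_petersen(n1: int, n2: int, m12: int) -> float:
    """Classic two-sample estimate of N, which assumes independent inspectors."""
    if m12 <= 0:
        raise ValueError("the estimate is undefined without overlapping defects")
    return n1 * n2 / m12


def relative_error(n_hat: float, n_actual: float) -> float:
    """Signed relative error of an estimate (assumed definition)."""
    return (n_hat - n_actual) / n_actual


# Hypothetical inspection: 30 real defects, inspectors find 15 and 12, sharing 8.
N_ACTUAL, N1, N2, M12 = 30, 15, 12, 8

ccv = ccv_two_inspectors(N1, N2, M12, N_ACTUAL)
n_hat = lincoln_petersen(N1, N2, M12)
re = relative_error(n_hat, N_ACTUAL)

# The value 26.0 is an invented estimate, used only to show the plug-in call.
c_hat_cv = ccv_two_inspectors(N1, N2, M12, 26.0)

if __name__ == "__main__":
    print(f"CCV = {ccv:.3f}, N-hat = {n_hat:.1f}, RE = {re:.3f}")
    print(f"plug-in CCV with N-hat = 26: {c_hat_cv:.3f}")
```

Under these assumed definitions the Lincoln-Petersen estimate satisfies RE = 1 / (1 + CCV) - 1, so a positive dependence between inspectors leads to underestimation, and the relationship is non-linear. Plugging the Lincoln-Petersen estimate itself into the CCV formula always gives zero, so a plug-in CCV is only informative with a different estimator of N.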