DOI: 10.1145/3084226.3084250
research-article

Towards Confidence with Capture-recapture Estimation: An Exploratory Study of Dependence within Inspections

Published: 15 June 2017

ABSTRACT

Background: Capture-recapture (CRC), a technique for post-inspection defect estimation, has been studied in the Software Engineering (SE) community since the 1990s. Most studies have focused on evaluating the performance of various CRC models and estimators; few have assessed the credibility of the estimation results, which makes quality-management decisions difficult when CRC is applied to defect estimation. Objective: This research explores a reliable and practical approach to assessing the credibility of CRC-based defect estimates. Method: A fundamental assumption of CRC is the statistical independence of samples, which can be measured by the 'Coefficient of CoVariation' (CCV). We applied CCV as an indicator of the statistical dependence between observations (i.e., the defects detected by inspectors) and assessed CRC estimation results on datasets published in the SE literature by examining the correlation between Relative Error (RE) and CCV. Based on the observed correlation, we further propose CĈV, which replaces the unknown N (the actual number of defects) with its estimate (N̂), to assess the credibility of CRC estimates. Results: Most datasets have non-zero CCVs, and the R² (coefficient of determination) of the non-linear curve fit between their CCVs and REs is higher than 0.8. Conclusions: Our study provides evidence that statistical dependence among inspectors is ubiquitous in existing CRC-related studies. Moreover, the significant correlation between CCV (CĈV in practice) and RE may make it possible to assess CRC-based estimates in support of quality management.
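The quantities named in the Method can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions here are assumptions: RE is taken as the signed relative error (N̂ − N)/N, and CCV is taken in the two-sample form N·m12/(n1·n2) − 1 commonly used in the capture-recapture literature, where n1 and n2 are the defects found by each inspector and m12 is the number found by both. The function names and the toy data are hypothetical and are not drawn from the paper's datasets.

```python
def relative_error(n_hat, n_true):
    """Signed relative error of an estimate: (N_hat - N) / N (assumed definition)."""
    if n_true <= 0:
        raise ValueError("n_true must be positive")
    return (n_hat - n_true) / n_true


def pairwise_ccv(found_a, found_b, n):
    """Two-sample coefficient of covariation, assumed form: N * m12 / (n1 * n2) - 1.

    found_a, found_b: sets of defect identifiers reported by two inspectors.
    n: population size; pass the true N for CCV, or an estimate N_hat for the
       plug-in variant the abstract calls C-hat-V.
    """
    n1, n2 = len(found_a), len(found_b)
    if n1 == 0 or n2 == 0:
        raise ValueError("each inspector must report at least one defect")
    m12 = len(found_a & found_b)  # defects found by both inspectors
    return n * m12 / (n1 * n2) - 1.0


# Hypothetical toy data: defect ids reported by two inspectors.
inspector_a = {1, 2, 3, 4}
inspector_b = {2, 3, 4, 5, 6}
n_true = 10   # assumed actual number of defects (unknown in practice)
n_hat = 8.0   # assumed output of some CRC estimator

print("RE  :", relative_error(n_hat, n_true))
print("CCV :", pairwise_ccv(inspector_a, inspector_b, n_true))
print("CCV with N_hat:", pairwise_ccv(inspector_a, inspector_b, n_hat))
```

Under this assumed form, a value of zero corresponds to the overlap expected from independent inspectors, and a positive value means the inspectors overlap more than independence would predict. Because N is unknown in practice, only the plug-in variant can be computed after a real inspection.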


  • Published in

    EASE '17: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering
    June 2017
    405 pages
    ISBN: 9781450348041
    DOI: 10.1145/3084226

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 71 of 232 submissions, 31%
