Towards Confidence with Capture-recapture Estimation: An Exploratory Study of Dependence within Inspections

Authors:
Guoping Rong, Software Institute, Nanjing University, China
Bohan Liu, Software Institute, Nanjing University, China
He Zhang, Software Institute, Nanjing University, China
Qiuping Zhang, Department of Computer Science and Technology, Nanjing University, China
Dong Shao, Software Institute, Nanjing University, China

EASE '17: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering, June 2017, Pages 242–251
https://doi.org/10.1145/3084226.3084250

Published: 15 June 2017

ABSTRACT

Background: Capture-ReCapture (CRC), a technique for post-inspection defect estimation, has been studied in the Software Engineering (SE) community since the 1990s. While most studies have focused on evaluating the performance of various CRC models and estimators, few have assessed the credibility of the estimation results, which makes decision-making for quality management difficult when CRC is applied to defect estimation. Objective: This research aims to explore a reliable and practical approach to assessing the credibility of CRC-based defect estimation. Method: One fundamental assumption of the CRC method is the statistical independence of samples, which can be measured by the 'Coefficient of CoVariation' (CCV). We applied CCV as an indicator of the statistical dependence between the observations (i.e., the defects detected by inspectors), and assessed the CRC estimation results on datasets published in the SE literature by examining the correlation between Relative Error (RE) and CCV. Based on the observed correlation, we further propose CĈV, which replaces the unknown N (the actual number of defects) with its estimate (N̂), to assess the credibility of CRC estimates. Results: We found that most datasets have non-zero CCVs, and that the R² (coefficient of determination) of the non-linear curve fit between their CCVs and REs is higher than 0.8. Conclusions: Our study provides evidence that statistical dependence among inspectors is ubiquitous in existing CRC-related studies. Moreover, the significant correlation between CCV (CĈV in practice) and RE may make it possible to assess CRC-based estimates in support of quality management.
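
The quantities named in the Method can be illustrated with a small sketch. The abstract does not give formulas, so everything here is an assumption: the pairwise sample form CCV = N * m / (n1 * n2) - 1 (with m the number of defects both inspectors found), the Lincoln-Petersen estimator as a stand-in for "a CRC estimator", and the defect IDs, which are made up. It is not the authors' implementation.

```python
def relative_error(n_hat: float, n_true: int) -> float:
    """Signed relative error of an estimate against the known defect count."""
    return (n_hat - n_true) / n_true


def pairwise_ccv(found_a: set, found_b: set, n: float) -> float:
    """Assumed sample CCV for two inspectors: n * overlap / (|A| * |B|) - 1."""
    overlap = len(found_a & found_b)
    return n * overlap / (len(found_a) * len(found_b)) - 1.0


def lincoln_petersen(found_a: set, found_b: set) -> float:
    """Two-sample estimate; undefined when the inspectors share no defects."""
    overlap = len(found_a & found_b)
    if overlap == 0:
        raise ValueError("no overlapping defects; estimate is undefined")
    return len(found_a) * len(found_b) / overlap


# Hypothetical data: defect IDs reported by two inspectors, 12 defects in total.
inspector_a = {1, 2, 3, 4, 5, 6}
inspector_b = {4, 5, 6, 7, 8}
true_n = 12

n_hat = lincoln_petersen(inspector_a, inspector_b)    # 6 * 5 / 3
rel_err = relative_error(n_hat, true_n)               # negative: underestimate
ccv = pairwise_ccv(inspector_a, inspector_b, true_n)  # positive: dependence
ccv_hat = pairwise_ccv(inspector_a, inspector_b, n_hat)
```

Under these assumptions the positive CCV goes together with an underestimate, and algebraically RE = -CCV / (1 + CCV) for this estimator. Note also that plugging the Lincoln-Petersen estimate back into the assumed formula always gives zero, so with only two inspectors a plug-in variant would need a different estimator or more inspectors to be informative.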

Published in
EASE '17: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering
June 2017
405 pages
ISBN: 9781450348041
DOI: 10.1145/3084226
Conference Chair: Emilia Mendes
Program Chairs: Steve Counsell, Kai Petersen
Copyright © 2017 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
capture-recapture
coefficient of covariation
defect estimation
statistical independence
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 71 of 232 submissions, 31%