
Correlation exploitation in error ranking

Published: 31 October 2004

Abstract

Static program checking tools can find many serious bugs in software, but due to analysis limitations they also frequently emit false error reports. Such false positives can easily render the error checker useless by hiding real errors amidst the false. Effective error report ranking schemes mitigate the problem of false positives by suppressing them during the report inspection process [17, 19, 20]. In this way, ranking techniques provide a complementary method to increasing the precision of the analysis results of a checking tool. A weakness of previous ranking schemes, however, is that they produce static rankings that do not adapt as reports are inspected, ignoring useful correlations amongst reports. This paper addresses this weakness with two main contributions. First, we observe that both bugs and false positives frequently cluster by code locality. We analyze clustering behavior in historical bug data from two large systems and show how clustering can be exploited to greatly improve error report ranking. Second, we present a general probabilistic technique for error ranking that (1) exploits correlation behavior amongst reports and (2) incorporates user feedback into the ranking process. In our results we observe a factor of 2-8 improvement over randomized ranking for error reports emitted by both intra-procedural and inter-procedural analysis tools.
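To make the abstract's idea concrete, here is a minimal illustrative sketch of adaptive, feedback-driven ranking that exploits clustering by code locality. This is an assumption-laden toy, not the paper's actual probabilistic model: it keeps a Beta posterior over the true-bug rate of each file, ranks pending reports by their file's posterior mean, and updates the posterior as the user labels reports, so correlated reports in the same file rise or fall together. All class and variable names are hypothetical.

```python
# Illustrative sketch only (not the paper's model): adaptive error-report
# ranking that exploits clustering of bugs and false positives by file.
from collections import defaultdict

class AdaptiveRanker:
    def __init__(self, reports, prior_alpha=1.0, prior_beta=1.0):
        # reports: list of (report_id, file_name) pairs
        self.pending = dict(reports)
        # Beta(alpha, beta) posterior over each file's true-bug rate
        self.counts = defaultdict(lambda: [prior_alpha, prior_beta])

    def score(self, file_name):
        a, b = self.counts[file_name]
        return a / (a + b)  # posterior mean P(true bug) for that locality

    def next_report(self):
        # Rank pending reports by their file's posterior mean; inspect the top one.
        return max(self.pending, key=lambda r: self.score(self.pending[r]))

    def feedback(self, report_id, is_true_bug):
        # User labels a report; update that file's posterior, re-ranking its peers.
        f = self.pending.pop(report_id)
        self.counts[f][0 if is_true_bug else 1] += 1.0

ranker = AdaptiveRanker([(1, "fs.c"), (2, "fs.c"), (3, "net.c")])
first = ranker.next_report()               # all files tied at the prior
ranker.feedback(first, is_true_bug=True)   # a confirmed bug boosts fs.c
```

After the confirmed bug in `fs.c`, the other `fs.c` report outranks the `net.c` report, which is the correlation effect the abstract describes; the paper's own technique is more general (a probabilistic model incorporating inter-report correlation and feedback), this merely illustrates the ranking-adapts-to-inspection loop.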

References

[1] A. Aiken, M. Faehndrich, and Z. Su. Detecting races in relay ladder logic programs. In TACAS 1998, 1998.
[2] T. Ball, M. Naik, and S. K. Rajamani. From symptom to cause: localizing errors in counterexample traces. In POPL 2003, 2003.
[3] T. Ball and S. Rajamani. Automatically validating temporal safety properties of interfaces. In SPIN 2001 Workshop on Model Checking of Software, May 2001.
[4] W. Bush, J. Pincus, and D. Sielaff. A static analyzer for finding dynamic programming errors. Software: Practice and Experience, 30(7):775--802, 2000.
[5] H. Chen and D. Wagner. MOPS: an infrastructure for examining security properties of software. In Proceedings of the 9th ACM Conference on Computer and Communications Security, pages 235--244. ACM Press, 2002.
[6] A. Chou, J. Yang, B. Chelf, S. Hallem, and D. Engler. An empirical study of operating systems errors. In SOSP 2001, 2001.
[7] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Interscience, 1991.
[8] R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer, 1999.
[9] M. Das, S. Lerner, and M. Seigle. Path-sensitive program verification in polynomial time. In PLDI 2002, 2002.
[10] D. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specific, programmer-written compiler extensions. In OSDI 2000, 2000.
[11] D. Evans, J. Guttag, J. Horning, and Y. Tan. LCLint: A tool for using specifications to check code. In FSE 1994, 1994.
[12] D. Evans and D. Larochelle. Improving security using extensible lightweight static analysis. IEEE Software, 19(1):42--51, January/February 2002.
[13] C. Flanagan and S. N. Freund. Type-based race detection for Java. In PLDI 2000, 2000.
[14] C. Flanagan, K. Leino, M. Lillibridge, G. Nelson, J. Saxe, and R. Stata. Extended static checking for Java. In PLDI 2002, pages 234--245. ACM Press, 2002.
[15] J. Foster, T. Terauchi, and A. Aiken. Flow-sensitive type qualifiers. In PLDI 2002, 2002.
[16] P. Good. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer, 2000.
[17] S. Hallem, B. Chelf, Y. Xie, and D. Engler. A system and language for building system-specific, static analyses. In PLDI 2002, 2002.
[18] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In International Conference on Software Engineering, May 2002.
[19] Intrinsa. A technical introduction to PREfix/Enterprise. Technical report, Intrinsa Corporation, 1998.
[20] T. Kremenek and D. Engler. Z-Ranking: Using statistical analysis to counter the impact of static analysis approximations. In SAS 2003, 2003.
[21] B. Liblit, A. Aiken, A. X. Zheng, and M. I. Jordan. Bug isolation via remote program sampling. In PLDI 2003, 2003.
[22] T. J. Ostrand, E. J. Weyuker, and R. M. Bell. Where the bugs are. In ISSTA 2004, 2004.
[23] U. Shankar, K. Talwar, J. S. Foster, and D. Wagner. Detecting format string vulnerabilities with type qualifiers. In Proceedings of the 10th USENIX Security Symposium, 2001.
[24] D. Wagner, J. Foster, E. Brewer, and A. Aiken. A first step towards automated detection of buffer overrun vulnerabilities. In NDSS 2000, Feb. 2000.
[25] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. In Exploring Artificial Intelligence in the New Millennium, pages 239--269. Morgan Kaufmann Publishers Inc., 2003.



Published In

SIGSOFT '04/FSE-12: Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering
October 2004
282 pages
ISBN:1581138555
DOI:10.1145/1029894
Also published in: ACM SIGSOFT Software Engineering Notes, Volume 29, Issue 6
November 2004
275 pages
ISSN:0163-5948
DOI:10.1145/1041685


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. error ranking
  2. program checking
  3. static analysis



Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%


Cited By

  • (2025) Line-Level Defect Prediction by Capturing Code Contexts With Graph Convolutional Networks. IEEE Transactions on Software Engineering, 51(1):172--191. DOI: 10.1109/TSE.2024.3503723. Online publication date: 1-Jan-2025.
  • (2023) Mitigating False Positive Static Analysis Warnings: Progress, Challenges, and Opportunities. IEEE Transactions on Software Engineering, 49(12):5154--5188. DOI: 10.1109/TSE.2023.3329667. Online publication date: 1-Dec-2023.
  • (2023) Understanding Why and Predicting When Developers Adhere to Code-Quality Standards. Proceedings of the 45th International Conference on Software Engineering: Software Engineering in Practice, pages 432--444. DOI: 10.1109/ICSE-SEIP58684.2023.00045. Online publication date: 17-May-2023.
  • (2023) WINE. Information and Software Technology, 155:C. DOI: 10.1016/j.infsof.2022.107109. Online publication date: 1-Mar-2023.
  • (2022) Conversational Code Analysis: The Future of Secure Coding. Coding Theory - Recent Advances, New Perspectives and Applications. DOI: 10.5772/intechopen.98362. Online publication date: 25-May-2022.
  • (2022) Detecting false alarms from automatic static analysis tools. Proceedings of the 44th International Conference on Software Engineering, pages 698--709. DOI: 10.1145/3510003.3510214. Online publication date: 21-May-2022.
  • (2022) Predicting Defective Lines Using a Model-Agnostic Technique. IEEE Transactions on Software Engineering, 48(5):1480--1496. DOI: 10.1109/TSE.2020.3023177. Online publication date: 1-May-2022.
  • (2021) Mining Fix Patterns for FindBugs Violations. IEEE Transactions on Software Engineering, 47(1):165--188. DOI: 10.1109/TSE.2018.2884955. Online publication date: 1-Jan-2021.
  • (2021) Test Suites as a Source of Training Data for Static Analysis Alert Classifiers. 2021 IEEE/ACM International Conference on Automation of Software Test (AST), pages 100--108. DOI: 10.1109/AST52587.2021.00019. Online publication date: May 2021.
  • (2021) FRUGAL. Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, pages 394--406. DOI: 10.1109/ASE51524.2021.9678617. Online publication date: 15-Nov-2021.
