Semantic crash bucketing

Published: 03 September 2018

Abstract

Precise crash triage is important for automated dynamic testing tools such as fuzzers. At scale, fuzzers produce millions of crashing inputs. Fuzzers use heuristics, such as stack hashes, to cut down on duplicate bug reports. These heuristics are fast but often imprecise: even after deduplication, hundreds of uniquely reported crashes can still correspond to the same bug. The remaining crashes must be inspected manually, which takes considerable effort. In this paper, we present Semantic Crash Bucketing, a generic method for precise crash bucketing using program transformation. Semantic Crash Bucketing maps crashing inputs to unique bugs as a function of changing a program (i.e., a semantic delta). We observe that a real bug fix precisely identifies the crashes that belong to the same bug. Our insight is to approximate real bug fixes with lightweight program transformation to obtain the same level of precision. Our approach uses (a) patch templates and (b) semantic feedback from the program to automatically generate and apply approximate fixes for general bug classes. Our evaluation shows that approximate fixes are competitive with true fixes for crash bucketing, and that they significantly outperform the built-in deduplication techniques of three state-of-the-art fuzzers.
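The core idea in the abstract, that a fix identifies the crashes belonging to one bug, can be illustrated with a small sketch. This is not the authors' implementation: it is a toy model in which a "program" is a Python callable, a "crash" is any raised exception, and the approximate fixes are written by hand rather than generated from patch templates. All names (`bucket_by_fix`, `parse`, `parse_fix_a`, `parse_fix_b`) are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Assumption: a program is modelled as a callable that raises on a "crash".
Program = Callable[[bytes], None]


def crashes(program: Program, data: bytes) -> bool:
    """Return True if running `program` on `data` raises an exception."""
    try:
        program(data)
    except Exception:
        return True
    return False


def bucket_by_fix(original: Program,
                  patched: Dict[str, Program],
                  inputs: Iterable[bytes]) -> Dict[str, List[bytes]]:
    """Group crashing inputs by the first patched variant that no longer crashes."""
    buckets: Dict[str, List[bytes]] = defaultdict(list)
    for data in inputs:
        if not crashes(original, data):
            continue  # not a crashing input, nothing to bucket
        for name, variant in patched.items():
            if not crashes(variant, data):
                buckets[name].append(data)
                break
        else:
            # No candidate fix stopped the crash.
            buckets["unbucketed"].append(data)
    return dict(buckets)


# Toy target with two distinct bugs (hypothetical).
def parse(data: bytes) -> None:
    if len(data) > 4:
        raise OverflowError("buffer overrun")  # bug A
    if data[:1] == b"\x00":
        raise ValueError("null dereference")   # bug B


def parse_fix_a(data: bytes) -> None:
    parse(data[:4])  # approximate fix for bug A: clamp the input length


def parse_fix_b(data: bytes) -> None:
    if data[:1] == b"\x00":
        return       # approximate fix for bug B: guard and bail out
    parse(data)


fixes = {"fix_a": parse_fix_a, "fix_b": parse_fix_b}
crash_inputs = [b"AAAAAA", b"\x00", b"BBBBBBBB", b"ok", b"\x00\x01"]
buckets = bucket_by_fix(parse, fixes, crash_inputs)
```

In this sketch, the two long inputs land in the `fix_a` bucket and the two inputs starting with a null byte land in `fix_b`, regardless of how their stack traces might differ. An input that triggers both bugs would be assigned to whichever fix is tried first, which is a simplification of this toy model.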

Published In

ASE '18: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering
September 2018
955 pages
ISBN: 9781450359375
DOI: 10.1145/3238147
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 September 2018

Author Tags

  1. Automated Bug Fixing
  2. Bug Triage
  3. Crash Bucketing
  4. Fuzzing
  5. Program Transformation

Qualifiers

  • Research-article

Conference

ASE '18

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 56
  • Downloads (Last 6 weeks): 13
Reflects downloads up to 23 Feb 2025

Cited By

  • (2023) A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Applied Sciences 13:15 (8788). DOI: 10.3390/app13158788. Online publication date: 29-Jul-2023
  • (2023) Acto: Automatic End-to-End Testing for Operation Correctness of Cloud System Management. Proceedings of the 29th Symposium on Operating Systems Principles (96-112). DOI: 10.1145/3600006.3613161. Online publication date: 23-Oct-2023
  • (2023) Research on the Exploitability of Binary Software Vulnerabilities. 2023 IEEE 12th International Conference on Cloud Networking (CloudNet) (403-407). DOI: 10.1109/CloudNet59005.2023.10490070. Online publication date: 1-Nov-2023
  • (2022) FuzzerAid: Grouping Fuzzed Crashes Based On Fault Signatures. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-12). DOI: 10.1145/3551349.3556959. Online publication date: 10-Oct-2022
  • (2022) Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-13). DOI: 10.1145/3551349.3556922. Online publication date: 10-Oct-2022
  • (2022) Fuzzing@Home: Distributed Fuzzing on Untrusted Heterogeneous Clients. Proceedings of the 25th International Symposium on Research in Attacks, Intrusions and Defenses (1-16). DOI: 10.1145/3545948.3545971. Online publication date: 26-Oct-2022
  • (2022) DeFault. Proceedings of the 44th International Conference on Software Engineering (635-646). DOI: 10.1145/3510003.3512760. Online publication date: 21-May-2022
  • (2022) DeepAnalyze. Proceedings of the 44th International Conference on Software Engineering (549-560). DOI: 10.1145/3510003.3512759. Online publication date: 21-May-2022
  • (2022) One fuzzing strategy to rule them all. Proceedings of the 44th International Conference on Software Engineering (1634-1645). DOI: 10.1145/3510003.3510174. Online publication date: 21-May-2022
  • (2022) BuildSheriff. Proceedings of the 44th International Conference on Software Engineering (312-324). DOI: 10.1145/3510003.3510132. Online publication date: 21-May-2022
