DOI: 10.1145/3320269.3384766

A Comb for Decompiled C Code

Published: 05 October 2020

Abstract

Decompilers are fundamental tools for security assessments of third-party software. The quality of the decompiled code can be a game changer, reducing the time and effort required for analysis. This paper proposes a novel approach to restructuring the control flow graph recovered from binary programs in a semantics-preserving fashion. The algorithm is designed from the ground up with two goals: producing C code that is goto-free, and drastically reducing the mental load required for an analyst to understand it. As a result, the code generated with this technique is well-structured, idiomatic, readable, easy to understand, and fully exploits the expressiveness of the C language. The algorithm has been implemented on top of the revng static binary analysis framework. The resulting decompiler, revngc, is compared on real-world binaries with state-of-the-art commercial and open source tools. The results show that our decompilation process introduces between 40% and 50% less extra cyclomatic complexity than these tools.
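
To make the metric in the last sentence concrete, the following is a minimal sketch of McCabe's cyclomatic complexity, M = E - N + 2P, computed on a control flow graph given as a list of edges. It is not taken from the paper or from revng: the function name `cyclomatic_complexity`, the two toy graphs, and the reading of "extra" complexity as the difference between a decompiled function and its original are all assumptions made for illustration.

```python
from collections import defaultdict


def cyclomatic_complexity(edges, nodes=None):
    """Return McCabe's M = E - N + 2P for a graph given as (src, dst) pairs.

    `nodes` may list additional nodes that have no edges (hypothetical helper,
    not part of any decompiler API).
    """
    node_set = set(nodes or ())
    for src, dst in edges:
        node_set.update((src, dst))

    # Count connected components (P) on the undirected view of the graph.
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen = set()
    components = 0
    for start in node_set:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacency[current] - seen)

    return len(edges) - len(node_set) + 2 * components


# Toy "original" function: a single if/else.
original_cfg = [
    ("entry", "then"), ("entry", "else"),
    ("then", "exit"), ("else", "exit"),
]

# Toy "decompiled" function: the same logic with one additional branch,
# as might happen when a decompiler introduces an extra condition.
decompiled_cfg = [
    ("entry", "a"), ("entry", "b"),
    ("a", "c"), ("a", "exit"),
    ("b", "exit"), ("c", "exit"),
]

original = cyclomatic_complexity(original_cfg)      # 4 - 4 + 2 = 2
decompiled = cyclomatic_complexity(decompiled_cfg)  # 6 - 5 + 2 = 3
extra = decompiled - original                       # assumed meaning of "extra"
print(original, decompiled, extra)                  # prints "2 3 1"
```

Under this reading, a decompiler that adds fewer branches than another while preserving semantics yields a smaller `extra` value for the same binary function.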

Published In

ASIA CCS '20: Proceedings of the 15th ACM Asia Conference on Computer and Communications Security
October 2020
957 pages
ISBN: 9781450367509
DOI: 10.1145/3320269

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. control flow restructuring
  2. decompilation
  3. goto
  4. reverse engineering

Qualifiers

  • Research-article

Conference

ASIA CCS '20

Acceptance Rates

Overall Acceptance Rate 418 of 2,322 submissions, 18%

Cited By

  • (2024) Ahoy SAILR! there is no need to DREAM of C. Proceedings of the 33rd USENIX Conference on Security Symposium, pp. 361-378. DOI: 10.5555/3698900.3698921. Online publication date: 14-Aug-2024
  • (2024) StackSight. Proceedings of the 41st International Conference on Machine Learning, pp. 13010-13028. DOI: 10.5555/3692070.3692591. Online publication date: 21-Jul-2024
  • (2024) "Len or index or count, anything but v1": Predicting Variable Names in Decompilation Output with Transfer Learning. 2024 IEEE Symposium on Security and Privacy (SP), pp. 4069-4087. DOI: 10.1109/SP54263.2024.00152. Online publication date: 19-May-2024
  • (2024) Software Code De-Compilation Techniques and Approaches: A Comparative Study. 2024 25th International Arab Conference on Information Technology (ACIT), pp. 1-4. DOI: 10.1109/ACIT62805.2024.10877024. Online publication date: 10-Dec-2024
  • (2024) FSmell: Recognizing Inline Function in Binary Code. Computer Security – ESORICS 2023, pp. 487-506. DOI: 10.1007/978-3-031-51476-0_24. Online publication date: 11-Jan-2024
  • (2023) FunProbe: Probing Functions from Binary Code through Probabilistic Analysis. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1419-1430. DOI: 10.1145/3611643.3616366. Online publication date: 30-Nov-2023
  • (2023) SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 679-693. DOI: 10.1145/3582016.3582058. Online publication date: 25-Mar-2023
  • (2022) The Convergence of Source Code and Binary Vulnerability Discovery -- A Case Study. Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, pp. 602-615. DOI: 10.1145/3488932.3497764. Online publication date: 30-May-2022
  • (2022) Looking for Criminal Intents in JavaScript Obfuscated Code. Procedia Computer Science 207:C, pp. 867-876. DOI: 10.1016/j.procs.2022.09.142. Online publication date: 1-Jan-2022
