DOI:10.1145/2556464.2556465
tutorial

Recovering C++ Objects From Binaries Using Inter-Procedural Data-Flow Analysis

Published: 22 January 2014

Abstract

Object-oriented programming complicates the already difficult task of reverse engineering software, and is increasingly used by malware authors. Unlike with traditional procedural-style code, reverse engineers must understand the complex interactions between object-oriented methods and the shared data structures on which they operate, a tedious manual process.
In this paper, we present a static approach that uses symbolic execution and inter-procedural data-flow analysis to discover object instances, data members, and methods of a common class. The key idea behind our work is to track the propagation and usage of a unique object instance reference, called a this pointer. Our goal is to help malware reverse engineers understand how classes are laid out and identify their methods. We have implemented our approach in a tool called ObJDIGGER, which produced encouraging results when validated on real-world malware samples.
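
The idea of following a this pointer across procedure boundaries can be illustrated with a toy sketch. The code below is not the ObJDIGGER implementation and does not perform symbolic execution; it is a minimal illustration under loud assumptions: a made-up instruction format (`new`, `mov`, `load`, `store`, `call`), straight-line code only, a single object instance, and the this pointer passed in `ecx` as in the Microsoft x86 `__thiscall` convention. Every function name, register, and offset in `PROGRAM` is hypothetical.

```python
# Hypothetical toy IR (an assumption for illustration, not a real disassembly format):
#   ("new", dst)              dst receives a fresh object reference
#   ("mov", dst, src)         copy register src into dst
#   ("load", dst, base, off)  read memory at [base + off] into dst
#   ("store", base, off, src) write src to memory at [base + off]
#   ("call", callee)          call callee; the this pointer, if any, is in "ecx"
PROGRAM = {
    "main": [
        ("new", "eax"),
        ("mov", "ecx", "eax"),
        ("call", "ctor"),
        ("mov", "ecx", "eax"),
        ("call", "get_x"),
    ],
    "ctor": [
        ("store", "ecx", 0, "zero"),
        ("store", "ecx", 4, "zero"),
    ],
    "get_x": [
        ("mov", "esi", "ecx"),
        ("load", "eax", "esi", 4),
    ],
}


def recover_class(program, entry):
    """Follow one object reference through calls; collect methods and member offsets."""
    methods = set()   # functions that receive the tracked reference in ecx
    members = set()   # offsets accessed relative to the tracked reference
    visited = set()   # guards against infinite recursion on cyclic call graphs

    def analyze(name, incoming):
        if name in visited:
            return
        visited.add(name)
        tracked = set(incoming)  # registers currently holding the this pointer
        for ins in program[name]:
            op = ins[0]
            if op == "new":
                tracked.add(ins[1])
            elif op == "mov":
                _, dst, src = ins
                if src in tracked:
                    tracked.add(dst)
                else:
                    tracked.discard(dst)
            elif op == "load":
                _, dst, base, off = ins
                if base in tracked:
                    members.add(off)
                tracked.discard(dst)  # dst now holds loaded data, not the reference
            elif op == "store":
                _, base, off, _src = ins
                if base in tracked:
                    members.add(off)
            elif op == "call":
                # Simplification: calls do not clobber the caller's registers.
                if "ecx" in tracked:
                    methods.add(ins[1])
                    analyze(ins[1], {"ecx"})

    analyze(entry, set())
    return methods, members


methods, members = recover_class(PROGRAM, "main")
print(sorted(methods), sorted(members))  # ['ctor', 'get_x'] [0, 4]
```

In this sketch, functions that receive the tracked reference are grouped as methods of one class, and the offsets they dereference approximate the class layout. A real analysis would also need to handle branches, multiple instances, and aliasing through memory.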

Published In

PPREW'14: Proceedings of ACM SIGPLAN on Program Protection and Reverse Engineering Workshop 2014
January 2014
69 pages
ISBN:9781450326490
DOI:10.1145/2556464
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

POPL '14

Acceptance Rates

PPREW'14 Paper Acceptance Rate 6 of 10 submissions, 60%;
Overall Acceptance Rate 21 of 36 submissions, 58%

Article Metrics

  • Downloads (Last 12 months): 52
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 13 Feb 2025

Cited By

  • (2024) BaseMirror: Automatic Reverse Engineering of Baseband Commands from Android's Radio Interface Layer. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 2311-2325. DOI: 10.1145/3658644.3690254. Online publication date: 2-Dec-2024
  • (2024) Abstraction of Design Information from Legacy C++ Program. 2024 Third International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE), pages 01-07. DOI: 10.1109/ICDCECE60827.2024.10549279. Online publication date: 26-Apr-2024
  • (2023) Egg hunt in Tesla infotainment. Proceedings of the 32nd USENIX Conference on Security Symposium, pages 3997-4014. DOI: 10.5555/3620237.3620461. Online publication date: 9-Aug-2023
  • (2022) Recovering container class types in C++ binaries. Proceedings of the 20th IEEE/ACM International Symposium on Code Generation and Optimization, pages 131-143. DOI: 10.1109/CGO53902.2022.9741274. Online publication date: 2-Apr-2022
  • (2021) Program Obfuscation via ABI Debiasing. Proceedings of the 37th Annual Computer Security Applications Conference, pages 146-157. DOI: 10.1145/3485832.3488017. Online publication date: 6-Dec-2021
  • (2021) AutoProfile: Towards Automated Profile Generation for Memory Analysis. ACM Transactions on Privacy and Security 25:1, pages 1-26. DOI: 10.1145/3485471. Online publication date: 23-Nov-2021
  • (2021) StateFormer: fine-grained type recovery from binaries using generative state modeling. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 690-702. DOI: 10.1145/3468264.3468607. Online publication date: 20-Aug-2021
  • (2021) Optimizing demand‐driven null dereference verification via merging branches. Expert Systems 39:6. DOI: 10.1111/exsy.12707. Online publication date: 27-May-2021
  • (2020) Devil is Virtual: Reversing Virtual Inheritance in C++ Binaries. Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, pages 133-148. DOI: 10.1145/3372297.3417251. Online publication date: 30-Oct-2020
  • (2019) VPS. Proceedings of the 35th Annual Computer Security Applications Conference, pages 97-112. DOI: 10.1145/3359789.3359797. Online publication date: 9-Dec-2019
