Science of Computer Programming

Volume 80, Part B, 1 February 2014, Pages 440-456

Recovering memory access patterns of executable programs

https://doi.org/10.1016/j.scico.2012.08.002

Abstract

This paper deals with the binary analysis of executable programs, with the goal of understanding how they access memory. It explains how to statically build a formal model of all memory accesses. Starting with a control flow graph of each procedure, well-known techniques are used to structure this graph into a hierarchy of loops, in all cases. The paper shows that much more information can be extracted by performing a complete data-flow analysis over machine registers after the program has been put in static single assignment (SSA) form. By using the SSA form, registers used in addressing memory can be symbolically expressed in terms of other, previously set registers. By including the loop structures in the analysis, loop indices and trip counts can often also be expressed symbolically. The whole process produces a formal model made of loops in which memory accesses are linear expressions of loop counters and registers. The paper provides a quantitative evaluation of the results when applied to several dozen SPEC benchmark programs. Because static analysis has no access to input data, the paper ends by describing a lightweight instrumentation strategy that collects enough information at run time to rebuild an exact trace. The section on applications also describes how the techniques developed in this paper can be used to perform automatic parallelization of binary code.
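
To make the idea of a loop model with linear access functions concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation: the class names `LinearAccess` and `Loop`, the function `rebuild_trace`, and the register names `rsi`, `rdi` and `n` are all hypothetical. It assumes a single loop whose accesses are linear in the loop counter and in registers set before the loop, and it shows how a few values recorded at run time would be enough to regenerate the full address trace.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class LinearAccess:
    """One memory access: address = const + sum(coeff * value) over symbols.

    Symbols are loop counters or registers set before the loop (hypothetical names).
    """
    const: int
    coeffs: Dict[str, int]

    def address(self, env: Dict[str, int]) -> int:
        return self.const + sum(c * env[name] for name, c in self.coeffs.items())


@dataclass
class Loop:
    """A loop with a normalized counter (0, 1, ...) and a symbolic trip count."""
    counter: str
    trip_count: str
    accesses: List[LinearAccess]


def rebuild_trace(loop: Loop, runtime_values: Dict[str, int]) -> Iterator[int]:
    """Regenerate every address from the static model plus recorded values."""
    env = dict(runtime_values)
    for i in range(runtime_values[loop.trip_count]):
        env[loop.counter] = i
        for access in loop.accesses:
            yield access.address(env)


# Assumed example: a copy loop over 4-byte elements, reading [rsi + 4*i]
# and writing [rdi + 4*i], executed n times.
model = Loop(
    counter="i",
    trip_count="n",
    accesses=[
        LinearAccess(const=0, coeffs={"rsi": 1, "i": 4}),
        LinearAccess(const=0, coeffs={"rdi": 1, "i": 4}),
    ],
)

# Only these three values would need to be collected at run time.
recorded = {"rsi": 0x1000, "rdi": 0x2000, "n": 3}
trace = list(rebuild_trace(model, recorded))
```

In this sketch the static model carries the shape of the accesses, so the run-time record holds three integers instead of six addresses; the saving grows with the trip count.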

Highlights

► We show how to parse binary programs and build a formal model of how they use memory.
► The paper focuses on loops and access functions. It explains how to obtain linear access functions, and loops with linear bounds.
► The efficacy of the approach is evaluated on almost 40 benchmark programs.
► One immediate application is the optimization of memory tracing.
► We also describe some applications to automatic parallelization of binary code.

Keywords

Memory accesses
Binary analysis
Decompilation
Static single assignment
Memory tracing
Program skeletonization
Automatic parallelization