Section: New Results

Static analysis of the memory behavior of executable programs

For the last year and a half, we have been developing techniques to analyze binary code. The major goal is to determine, from a binary executable program and its libraries, how executing this program will use memory. Once the program's memory behavior is characterized in some abstract (and parametrized) way, several studies can be conducted directly on the binary program, without any need for the source code, which may be totally or partially unavailable. Our own research topics provide several examples of interesting facts that can be derived directly. The first example is “program skeletonization”, a lightweight instrumentation strategy to obtain a full memory trace (e.g., for dynamic dependence analysis or cache simulation). Another example is the static parallelization of binary code, i.e., building a parallelizing compiler that works directly on executable programs and can thus handle mixed-language programs, programs built with proprietary libraries, and so on. In all cases, obtaining the memory behavior of the binary program turned out to be a central element. Existing systems were rudimentary, and the various techniques and algorithms that we had to use or develop specifically are significant enough that we have decided to promote this topic as a scientific result.

Our approach is to view a binary program through the program structures that are commonly used in compiler techniques. The code is split into basic blocks organized into a control-flow graph, and this control-flow graph is structured as a hierarchy of loops. This can be done with well-known textbook techniques, and many systems have used a similar approach (even though our experience suggests that many of them fail to deal properly with some artifacts specific to binary code, such as irreducible loops). From that point on, approaches vary widely, but no existing technique seemed both systematic and efficient while remaining precise enough. We have chosen to base the rest of the analysis on the Static Single Assignment (SSA) form, which has been tremendously useful in developing compiler optimizations at the source level or the intermediate-representation level. We have shown that, by not trying to be overly precise early on, one can extract an interesting representation of a program from its binary code.
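The textbook step mentioned above, finding loops in a control-flow graph, can be sketched as follows. This is a minimal illustration and not the system described here: the graph encoding (a dictionary from block name to successor list) and the block names are assumptions made for the example. It computes dominators iteratively and reports back edges, i.e., edges whose target dominates their source. The second graph shows why irreducible loops need special care: it contains a cycle, yet the dominator-based test finds no back edge.

```python
def dominators(cfg, entry):
    """Iterative dominator sets: dom[n] = {n} | intersection of dom[p] over predecessors p."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def back_edges(cfg, entry):
    """Edges (src, dst) where dst dominates src; each one identifies a natural loop headed by dst."""
    dom = dominators(cfg, entry)
    return sorted((src, dst)
                  for src, succs in cfg.items()
                  for dst in succs
                  if dst in dom[src])


# Hypothetical reducible CFG: a single loop headed by "head".
reducible = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}

# Hypothetical irreducible CFG: the cycle a <-> b can be entered at either block.
irreducible = {"entry": ["a", "b"], "a": ["b"], "b": ["a"]}

loops_reducible = back_edges(reducible, "entry")
loops_irreducible = back_edges(irreducible, "entry")
```

On the first graph the only back edge is from "body" to "head". On the second graph the list is empty even though the blocks "a" and "b" form a cycle, so a loop hierarchy built only from back edges would miss it.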

Extracting the memory behavior of a program means constructing a representation of the program that explicitly expresses how the program will access memory, in a manner that is amenable to detailed analysis. The hierarchy of loops provides the global structure, and an SSA-based symbolic analysis details how the individual memory accesses vary from one iteration to the next. This requires two analysis phases once the program is in SSA form. The first phase combines program slicing and forward substitution to express every memory address symbolically. The second phase focuses on loops and determines which registers hold values that vary linearly across loop iterations. The resulting representation is a program where memory accesses are defined by linear combinations of loop counters and parameters (specific register versions), with the latter hopefully loop invariant. This representation is actually very similar to the one used for static control programs (when applicable), and forms the basis of almost all parallelization techniques. These results have already been published [13], and the conference program committee has invited us to prepare an extended version of this paper for publication in an international journal in 2011.
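The two phases can be illustrated on a toy single-loop example. The sketch below is an assumption-laden illustration, not the published algorithm: the SSA encoding (tuples such as `("add", a, b)`), the register-version names (`rdi0`, `i1`, `t2`), and the loop-counter symbol `k` are all hypothetical. Forward substitution follows definitions back from the address, and a phi node whose update adds a constant step is replaced by an initial value plus the step times the loop counter.

```python
def add(a, b):
    """Sum of two linear forms, each a dict {symbol: coefficient}."""
    out = dict(a)
    for sym, coeff in b.items():
        out[sym] = out.get(sym, 0) + coeff
    return {sym: coeff for sym, coeff in out.items() if coeff != 0}


def scale(a, factor):
    """Linear form multiplied by an integer constant."""
    return {sym: coeff * factor for sym, coeff in a.items() if coeff * factor != 0}


def resolve(name, defs):
    """Express an SSA value as a linear form over parameters and the loop counter "k".

    The constant term is stored under the key "1".
    """
    if isinstance(name, int):
        return {"1": name} if name else {}
    if name not in defs:
        return {name: 1}  # no definition in the loop: treated as a loop-invariant parameter
    op, *args = defs[name]
    if op == "add":
        return add(resolve(args[0], defs), resolve(args[1], defs))
    if op == "mul":  # only multiplication by a constant stays linear
        return scale(resolve(args[0], defs), args[1])
    if op == "phi":  # phi(init, update) at the loop header
        init, update = args
        uop, ua, ub = defs[update]
        if uop == "add" and ua == name and isinstance(ub, int):
            return add(resolve(init, defs), {"k": ub})  # init + step * k
        raise ValueError(f"{name} is not a simple linear induction variable")
    raise ValueError(f"unsupported operation {op}")


# Hypothetical SSA form of: for (i = 0; ...; i++) load [rdi + 8*i]
defs = {
    "i1": ("phi", 0, "i2"),      # i1 = phi(0, i2)
    "i2": ("add", "i1", 1),      # i2 = i1 + 1
    "t1": ("mul", "i1", 8),      # t1 = i1 * 8
    "t2": ("add", "rdi0", "t1"), # t2 = rdi0 + t1, the address of the load
}

address = resolve("t2", defs)
```

Here `address` is the linear form `rdi0 + 8*k`: one parameter (a specific register version) plus a multiple of the loop counter, which is the shape of representation described above. Nested loops would need one counter per loop, which this sketch leaves out.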

Building on this foundation, we have developed several techniques to enhance our basic strategy. The major limitation of our strategy, as just described, is that it restricts itself to address computations that happen completely inside the registers, ignoring any flow of data to and from memory. Since fully characterizing this traffic is clearly out of reach (it would mean completely solving a general parametrized dependence problem), we have chosen to solve a restricted problem: separating accesses to the current stack frame from accesses to the rest of the memory. This requires a specific form of points-to analysis that is able to determine whether a given address “points to” a location inside the current stack frame, or to a location outside of it. In many cases this lets the system determine that two memory address expressions either cannot alias, because they point to separate portions of memory, or may alias, in which case a simple and conservative comparison of address expressions can decide whether they designate the same memory location. We solve this problem with a forward data-flow analysis. Most of the time, this approach is enough to let the system track the flow of data through the stack slots, which, in turn, provides a rough equivalent of use-def links for these stack slots. When used to derive symbolic expressions for memory addresses (as explained above), it lets the slicing process go further back, and lets the induction variable resolution apply to stack slots. The overall result is that more loops are completely understood by the decompilation process.
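A forward data-flow analysis of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis actually implemented: the three-value classification (`STACK`, `OTHER`, `UNKNOWN`), the tiny instruction set (`mov`, `addimm`, `load`), and the initial state (stack pointer in the frame, incoming argument register outside it) are all hypothetical. In particular, the sketch assumes that adding a constant to an address never moves it out of its region.

```python
STACK, OTHER, UNKNOWN = "stack", "other", "unknown"


def join(a, b):
    """Merge two classifications at a control-flow join; None means 'not yet defined'."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else UNKNOWN


def transfer(state, block):
    """Apply a block's instructions (op, dst, src) to a register classification."""
    state = dict(state)
    for op, dst, src in block:
        if op in ("mov", "addimm"):  # copy, or address plus constant (assumed to stay in region)
            state[dst] = state.get(src, UNKNOWN)
        elif op == "load":           # a value read from memory is not tracked here
            state[dst] = UNKNOWN
    return state


def analyze(blocks, succs, entry, init):
    """Worklist iteration; returns the classification at the entry of each reachable block."""
    in_state = {entry: dict(init)}
    work = [entry]
    while work:
        b = work.pop()
        out = transfer(in_state[b], blocks[b])
        for s in succs.get(b, []):
            old = in_state.get(s)
            if old is None:
                new = out
            else:
                new = {r: join(old.get(r), out.get(r)) for r in set(old) | set(out)}
            if new != old:
                in_state[s] = new
                work.append(s)
    return in_state


def may_alias(a, b):
    """Two addresses cannot alias when one is in the stack frame and the other is outside."""
    return {a, b} != {STACK, OTHER}


# Hypothetical diamond-shaped CFG: r3 receives a stack address on one path, a non-stack one on the other.
blocks = {
    "entry": [("addimm", "r1", "rsp"), ("mov", "r2", "rdi")],
    "then": [("mov", "r3", "r1")],
    "else": [("mov", "r3", "r2")],
    "join": [("load", "r4", "r3")],
}
succs = {"entry": ["then", "else"], "then": ["join"], "else": ["join"]}
result = analyze(blocks, succs, "entry", {"rsp": STACK, "rdi": OTHER})
at_join = result["join"]
```

At the join block, `r1` is classified as a stack address and `r2` as a non-stack address, so `may_alias` rules them out as aliases. `r3` is `UNKNOWN`, so any access through it would fall back to the conservative comparison of address expressions mentioned above.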
