Section: New Results
Register Allocation and SSA Form Properties
The work presented in this section is joint work with members of the CEC team at STMicroelectronics. It is the natural extension of the work described in Section 6.2 and deals with the following problems related to register allocation: spilling, coalescing, and splitting. Register allocation consists in mapping abstract variables to physical registers. Spilling chooses which variables to store in memory when there are too few registers to hold all of them; coalescing merges two variables into one when possible, hence reducing the number of variables; splitting does the converse. Understanding the interactions between these three techniques is the heart of our work.
Register allocation Register allocation is one of the most studied problems in compilation. It has been considered NP-complete since Chaitin et al., in 1981, modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated with the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the re-discovery that interference graphs of SSA programs can be colored in polynomial time raised the question: can we use SSA to do register allocation in polynomial time, without contradicting Chaitin et al.'s NP-completeness result? To address this question and, more generally, the complexity of register allocation, we have revisited Chaitin et al.'s proof to identify the interactions between spilling, coalescing/splitting, critical edges, and coloring. Our study shows that the NP-completeness of register allocation is not due to the coloring phase, as a misinterpretation of Chaitin et al.'s reduction from graph k-coloring may suggest. If live-range splitting is taken into account, deciding whether k registers are enough or whether some spilling is necessary is not as hard as one might think. The NP-completeness of register allocation is due to three factors: the presence of critical edges, the optimization of spilling costs (if k registers are not enough), and the optimization of coalescing costs, i.e., deciding which live ranges should be fused while keeping the graph k-colorable. These results have been presented at WDDD and LCPC.
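The polynomial-time coloring result can be illustrated with a small sketch (the graph, names, and ordering below are our own toy example, not taken from the cited work). Interference graphs of SSA programs are chordal, and chordal graphs can be colored optimally by a greedy scan along a suitable ordering; for SSA programs, an ordering derived from the dominance order of the variables' definitions plays this role:

```python
# Greedy coloring of a toy chordal interference graph (hypothetical example).
# Coloring along the reverse of a perfect elimination order uses the minimum
# number of colors; for SSA interference graphs, a dominance-based definition
# order provides such an ordering, so coloring is polynomial.

def greedy_color(neighbors, order):
    """Assign the smallest color not used by already-colored neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Toy graph: a, b, c pairwise interfere (a clique); d interferes only with c.
neighbors = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
# a, b, c, d is the reverse of a perfect elimination order (d, c, b, a).
coloring = greedy_color(neighbors, ["a", "b", "c", "d"])
print(coloring)                    # {'a': 0, 'b': 1, 'c': 2, 'd': 0}
print(max(coloring.values()) + 1)  # 3 registers: the clique {a, b, c} forces it
```

The greedy scan never backtracks, which is what makes the chordal case easy; the hardness discussed above only reappears once splitting is forbidden or costs must be optimized.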
Critical edges A critical edge is an edge of the control-flow graph whose source block has more than one successor and whose target block has more than one predecessor. Forbidding live-range splitting on critical edges makes the coloring problem NP-complete. Similarly, the presence of critical edges makes the translation out of SSA form difficult (see Section 6.2), and most existing approaches are not able to handle code with critical edges. However, some real codes (supported by some compilers) do contain critical edges that the compiler is not able to split (abnormal edges). Also, splitting an edge has a cost, and it might be preferable not to split an edge of high execution frequency. We have introduced a new technique called parallel-copy motion. Basically, the copies related to a split can be moved backward or forward across the edge; a parallel copy moved backward into the source block must be compensated by a reverse parallel copy on the other outgoing edges of this block; the process has to be iterated and can, of course, loop and lead to an inconsistency. Determining whether this inconsistency is avoidable is precisely where the hardness of coloring comes from. Our experiments have shown that, in practice, for the benchmark suites available from our collaboration with STMicroelectronics, this problem can be solved in polynomial time and critical edges can be efficiently handled this way. We are currently writing a research report presenting these results.
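The definition above is easy to check mechanically; the following sketch (on a hypothetical toy CFG of our own) shows why such edges are problematic: a copy placed "on" a critical edge has no block of its own to live in unless the edge is split, which is exactly the situation parallel-copy motion is designed to avoid.

```python
# Detecting critical edges in a CFG given as a successor map (toy example).
# An edge (u, v) is critical when u has more than one successor and v has
# more than one predecessor.

def critical_edges(successors):
    """Return the set of critical edges of the CFG."""
    preds = {}
    for u, succs in successors.items():
        for v in succs:
            preds.setdefault(v, set()).add(u)
    return {
        (u, v)
        for u, succs in successors.items()
        for v in succs
        if len(succs) > 1 and len(preds[v]) > 1
    }

# Hypothetical CFG: B1 branches to B2 and B4; B4 is also reached from B3.
cfg = {"B1": {"B2", "B4"}, "B2": {"B3"}, "B3": {"B4"}, "B4": set()}
print(critical_edges(cfg))  # {('B1', 'B4')}: B1 branches and B4 is a join
```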
Spilling Minimizing the number of spilled variables is a highly-studied problem in compiler design; it is NP-complete in general. An important consequence of our study (see the previous paragraph on register allocation) is that the goal of the spilling phase is simply to lower the register pressure (the number of variables simultaneously live) at every program point, while the associated optimization problem is to minimize the spilling cost. The question raised by this remark was: is it easier to solve the spilling problem under SSA? A positive answer would lead to register allocation heuristics that transform a program to SSA, spill the variables, allocate them with the exact number of registers, and then go out of SSA. Spilling can be considered at different granularity levels. The coarsest, called ``spill everywhere'', considers the live range of each variable entirely: a spilled variable results in a store after each definition point and a load before each use point. The finest granularity, called ``load-store optimization'', optimizes each load and store separately. This accurate formulation, also known as ``paging with write back'', is NP-complete, even for a basic block in SSA form. However, the coarser view (spill everywhere) is much simpler and can be solved in polynomial time for an SSA basic block. What about more general graph structures? We have studied in detail the spill-everywhere formulation under SSA form. We have considered different instances of the problem depending on the machine model and on the relative values of register pressure and number of registers. We found that most instances are NP-complete. The complexity almost always comes from the number of registers and the number of arguments per instruction (for RISC machines), but not from the register pressure. We are currently writing a report on these results.
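A minimal sketch of why spill everywhere is easy on a basic block (the intervals and names below are hypothetical, and unit spill costs are assumed): inside a straight-line block every live range is an interval, so wherever more than k intervals overlap one can greedily evict the interval that extends furthest, a furthest-end strategy in the spirit of the polynomial-time result mentioned above.

```python
# Toy "spill everywhere" on a straight-line block with k registers.
# Each variable's live range is an interval (def point, last use point).
# Wherever register pressure exceeds k, evict the live interval whose end
# is furthest; the evicted variable is spilled everywhere (its whole range).

def spill_everywhere(intervals, k):
    """intervals: {var: (start, end)}; returns the set of spilled variables."""
    spilled = set()
    points = sorted({p for s, e in intervals.values() for p in (s, e)})
    for p in points:  # scan program points left to right
        live = [v for v in intervals
                if v not in spilled
                and intervals[v][0] <= p <= intervals[v][1]]
        while len(live) > k:
            victim = max(live, key=lambda v: intervals[v][1])  # furthest end
            spilled.add(victim)
            live.remove(victim)
    return spilled

# Hypothetical block with 4 variables and k = 2 registers: pressure reaches
# 3 at points 2 and 3, where a, b, and c are simultaneously live.
intervals = {"a": (0, 3), "b": (1, 6), "c": (2, 4), "d": (5, 7)}
print(spill_everywhere(intervals, 2))  # {'b'}: one spill removes the peak
```

On general control-flow graphs live ranges are no longer intervals, which is one reason the NP-completeness results above reappear.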
Coalescing Finally, the last question is related to the coalescing problem: how can we minimize, or at least reduce, the number of generated copies? We have distinguished several optimizations that occur in most coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while preserving the colorability of the graph; c) incremental conservative coalescing removes one particular move while preserving the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up on as few moves as possible so that the graph becomes colorable again. We have almost completely classified the NP-completeness of these problems, also discussing the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. These results will be presented at CGO'07. As for experiments, we already have promising results with heuristics that outperform previously proposed techniques.
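To make the conservative flavor concrete, here is a sketch of the classic Briggs test (a well-known conservative criterion; the toy graphs and names are ours, not from the cited results): two copy-related variables may be merged if the merged node has fewer than k neighbors of degree at least k, since the merged graph then still simplifies by the usual greedy removal of low-degree nodes.

```python
# Briggs-style conservative coalescing test on toy interference graphs.
# Merging x and y is safe if the merged node has fewer than k neighbors
# of degree >= k: greedy (degree < k) simplification still succeeds.

def briggs_can_coalesce(neighbors, x, y, k):
    """Conservatively decide whether coalescing x and y keeps k-colorability."""
    merged = (neighbors[x] | neighbors[y]) - {x, y}
    high_degree = sum(1 for n in merged if len(neighbors[n]) >= k)
    return high_degree < k

# x and y are copy-related and do not interfere; k = 2 registers.
g1 = {"x": {"a"}, "y": {"b"}, "a": {"x"}, "b": {"y"}}
print(briggs_can_coalesce(g1, "x", "y", k=2))  # True: no high-degree neighbor

g2 = {"x": {"a"}, "y": {"b"},
      "a": {"x", "c"}, "b": {"y", "c"}, "c": {"a", "b"}}
print(briggs_can_coalesce(g2, "x", "y", k=2))  # False: a and b have degree 2
```

The test is conservative: when it answers False the merge might still be safe, which is precisely the gap that optimistic and incremental strategies try to exploit.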
These developments are part of two contracts (see Section 7.1 and Section 7.2) with the CEC team at STMicroelectronics and form the theoretical foundations of our implementations in the STMicroelectronics compiler.