Section: New Results
Formal verification of compilers
The Compcert verified compiler for the C language
In the context of our work on compiler verification (see section 3.3.1), since 2005 we have been developing and formally verifying a moderately optimizing compiler for a large subset of the C programming language, generating assembly code for the PowerPC and ARM architectures. This compiler comprises a back-end, translating the Cminor intermediate language to PowerPC assembly and reusable for source languages other than C, and a front-end, translating the Clight subset of C to Cminor. The compiler is mostly written within the specification language of the Coq proof assistant, from which Coq's extraction facility generates executable Caml code. The compiler comes with a 40000-line, machine-checked Coq proof of semantic preservation establishing that the generated assembly code executes exactly as prescribed by the semantics of the source Clight program.
This year, we improved the Compcert C compiler in several ways. First and foremost, we added support for "goto" and labeled statements to the Clight language. While the compilation of "goto" is trivial, it was difficult to find a suitable operational semantics for this unstructured control operation. The solution was a transition semantics that uses continuations to represent the current control point within the program. The proofs of semantic preservation from Clight to C#minor to Cminor were entirely redone for this new, extended semantics. Other improvements to the C compiler include:
A new optimization pass: recognition of tail calls to functions.
Small improvements in the speed and size of the generated code.
Better modularization of the compiler between the architecture-independent and the architecture-dependent parts.
Accounting for alignment constraints in the memory model.
Reduced compilation times and compile-time memory requirements.
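The continuation-based treatment of "goto" mentioned above can be illustrated on a toy language. The following Python sketch is not the Clight semantics: the statement encoding, the names `find_label`, `step`, `run` and `PROGRAM`, and the fuel bound are all assumptions made for illustration. It shows the general idea only: a state pairs the current statement with a continuation describing what remains to be done, and a "goto" searches the function body for the label while rebuilding the continuation that applies at that point.

```python
# Toy statements (hypothetical encoding):
#   ("skip",)  ("assign", var, fn)  ("seq", s1, s2)
#   ("if", cond, s1, s2)  ("label", name, s)  ("goto", name)
# Continuations: ("stop",) or ("kseq", next_statement, continuation)

def find_label(lbl, s, k):
    """Return (statement, continuation) for label lbl inside s, or None."""
    tag = s[0]
    if tag == "seq":
        found = find_label(lbl, s[1], ("kseq", s[2], k))
        return found if found is not None else find_label(lbl, s[2], k)
    if tag == "if":
        found = find_label(lbl, s[2], k)
        return found if found is not None else find_label(lbl, s[3], k)
    if tag == "label":
        return (s[2], k) if s[1] == lbl else find_label(lbl, s[2], k)
    return None

def step(body, s, k, env):
    """One transition; returns the next (statement, continuation, env) or None."""
    tag = s[0]
    if tag == "skip":
        if k[0] == "stop":
            return None                      # nothing left to execute
        return k[1], k[2], env               # resume the pending statement
    if tag == "assign":
        new_env = dict(env)
        new_env[s[1]] = s[2](env)
        return ("skip",), k, new_env
    if tag == "seq":
        return s[1], ("kseq", s[2], k), env  # remember s2 in the continuation
    if tag == "if":
        return (s[2] if s[1](env) else s[3]), k, env
    if tag == "label":
        return s[2], k, env
    if tag == "goto":
        target = find_label(s[1], body, ("stop",))
        if target is None:
            raise ValueError("unknown label: " + s[1])
        return target[0], target[1], env     # jump: replace statement and continuation
    raise ValueError("unknown statement: " + tag)

def run(body, env, fuel=10_000):
    """Iterate transitions until termination; fuel bounds the number of steps."""
    s, k = body, ("stop",)
    for _ in range(fuel):
        nxt = step(body, s, k, env)
        if nxt is None:
            return env
        s, k, env = nxt
    raise RuntimeError("out of fuel")

# A loop written with goto: sums 1..n into acc.
LOOP_BODY = ("seq", ("assign", "i", lambda e: e["i"] + 1),
             ("seq", ("assign", "acc", lambda e: e["acc"] + e["i"]),
              ("goto", "loop")))
PROGRAM = ("seq", ("assign", "i", lambda e: 0),
           ("seq", ("assign", "acc", lambda e: 0),
            ("label", "loop",
             ("if", lambda e: e["i"] < e["n"], LOOP_BODY, ("skip",)))))

print(run(PROGRAM, {"n": 4})["acc"])  # prints 10
```

Because the continuation is explicit data, jumping into the middle of a compound statement needs no special case: `find_label` reconstructs exactly the continuation that falling through to the label would have produced.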
Two versions of the Compcert development were publicly released, integrating these improvements: version 1.4 in April and 1.5 in August.
Three journal articles on the Compcert project were published this year: a short general overview in the Research Highlights column of Communications of the ACM; a very detailed (80 pages) description of the back-end and its proof of correctness in Journal of Automated Reasoning; and a description of the Clight source language and the mechanization of its semantics, also in Journal of Automated Reasoning.
Verified translation validation
Verified translation validation provides an alternative to proving semantic preservation for the transformations involved in a certified compiler. Instead of proving that a given transformation is correct, we validate it a posteriori, i.e. we verify that the transformed program behaves like the original. The validation algorithm is described using the Coq proof assistant and proved correct, meaning that it accepts only transformed programs that are semantically equivalent to the original. In contrast, the program transformation itself can be implemented in any language and does not need to be proved correct.
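The division of labor between an untrusted transformation and a trusted validator can be sketched on straight-line code. The Python fragment below is an illustration under stated assumptions, not one of the validators developed in this work: the instruction format `(dst, op, src1, src2)` and the names `symbolic_eval`, `validate`, `untrusted_swap_first_two` and `validated_transform` are hypothetical. The validator evaluates both blocks symbolically and accepts only if every register ends up with the same symbolic term.

```python
def symbolic_eval(block):
    """Map each assigned register to a term over the initial register values."""
    state = {}

    def term(reg):
        return state.get(reg, ("init", reg))

    for dst, op, a, b in block:
        state[dst] = (op, term(a), term(b))
    return state

def validate(original, transformed):
    """Accept only if both blocks compute identical symbolic terms everywhere."""
    s1, s2 = symbolic_eval(original), symbolic_eval(transformed)
    regs = set(s1) | set(s2)
    return all(s1.get(r, ("init", r)) == s2.get(r, ("init", r)) for r in regs)

def untrusted_swap_first_two(block):
    """A deliberately naive 'scheduler': swaps two instructions without any check."""
    if len(block) < 2:
        return list(block)
    return [block[1], block[0]] + list(block[2:])

def validated_transform(block, transform):
    """Run the untrusted transformation; keep its result only if validated."""
    candidate = transform(block)
    return candidate if validate(block, candidate) else list(block)

independent = [("t1", "add", "a", "b"),
               ("t2", "mul", "c", "d"),
               ("t3", "add", "t1", "t2")]
dependent = [("t1", "add", "a", "b"),
             ("t2", "mul", "t1", "c")]

# The swap is accepted when the two instructions are independent ...
print(validated_transform(independent, untrusted_swap_first_two) != independent)  # prints True
# ... and rejected (original kept) when the second instruction reads t1.
print(validated_transform(dependent, untrusted_swap_first_two) == dependent)      # prints True
```

Only `validate` would need a correctness proof in this scheme. Note that comparing terms syntactically is sound but incomplete: a transformation that exploits, say, commutativity of `add` would be rejected even though it is correct.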
Jean-Baptiste Tristan, under the supervision of Xavier Leroy, investigated this approach in the case of software pipelining. Software pipelining is a very ambitious loop optimization that exploits instruction-level parallelism between consecutive iterations of a loop. This year, Jean-Baptiste Tristan successfully designed, implemented, and verified a validator for a software pipeliner, the key component of the software pipelining optimization. The validator is based on symbolic evaluation, here applied for the first time to the verification of a loop optimization. This result was accepted for publication at the forthcoming POPL 2010 symposium.
Jean-Baptiste Tristan completed his Ph.D. dissertation, titled Formal Verification of Translation Validators, and successfully defended it in November. The dissertation consolidates the results of four case studies of verified validation applied to advanced optimizations: lazy code motion, list scheduling, trace scheduling, and software pipelining.
Silvain Rideau, a first-year student at ENS Paris supervised by Xavier Leroy, developed and proved correct another translation validator dedicated to the register allocation and spilling/reloading phases of a compiler. While state-of-the-art register allocation algorithms have been formally verified before in the context of the Compcert project, the spilling and reloading algorithm that has been verified so far is quite naive and inappropriate for register-poor target processors such as the popular x86 architecture. Silvain Rideau designed a translation validation algorithm, based on backward dataflow analysis, that can validate a posteriori the results of register allocation with aggressive spilling and reloading strategies, including live range splitting, reuse of reloaded quantities, and coalescing. He proved the soundness of this validator using the Coq proof assistant, and validated it experimentally on a nontrivial spilling strategy involving live range splitting at each use and definition of a spilled temporary. These results were accepted for publication at the Compiler Construction 2010 conference.
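To give a flavor of validation by backward analysis, here is a much-simplified Python sketch restricted to straight-line code, so that a single backward pass suffices and no fixpoint iteration is needed. It is an assumption-laden illustration, not the algorithm proved correct in this work: the instruction encodings and the name `validate_allocation` are hypothetical. The pass propagates a set of equations `(temporary, location)` that must hold between the original and the allocated code, starting from the equations required at exit and checking that those remaining at entry are guaranteed.

```python
def validate_allocation(original, transformed, entry, exit_):
    """original:    list of (dst_temp, op, arg_temps)
    transformed: list of ("op", dst_loc, op, arg_locs) or ("move", src_loc, dst_loc)
    entry, exit_: sets of (temp, loc) equations assumed at entry / required at exit."""
    required = set(exit_)
    remaining = list(original)
    for instr in reversed(transformed):
        if instr[0] == "move":
            # Spill, reload or copy: dst now holds what src held before.
            _, src, dst = instr
            required = {(t, src if l == dst else l) for (t, l) in required}
            continue
        if not remaining:
            return False                      # extra computation in transformed code
        t_dst, t_op, t_args = remaining.pop()
        _, l_dst, l_op, l_args = instr
        if t_op != l_op or len(t_args) != len(l_args):
            return False
        for t, l in required:
            # An equation may mention the defined temp only together with the
            # defined location, and vice versa; anything else is clobbered.
            if (t == t_dst) != (l == l_dst):
                return False
        required.discard((t_dst, l_dst))
        required |= set(zip(t_args, l_args))  # arguments must agree before the instruction
    return not remaining and required <= set(entry)

# t1 = x + y; t2 = t1 * x; t3 = t1 + t2, with two registers and one stack slot.
original = [("t1", "add", ("x", "y")),
            ("t2", "mul", ("t1", "x")),
            ("t3", "add", ("t1", "t2"))]
spilled = [("op", "r2", "add", ("r1", "r2")),
           ("move", "r2", "s0"),              # spill t1
           ("op", "r1", "mul", ("r2", "r1")),
           ("move", "s0", "r2"),              # reload t1
           ("op", "r1", "add", ("r2", "r1"))]
clobbering = [("op", "r2", "add", ("r1", "r2")),
              ("op", "r2", "mul", ("r2", "r1")),  # overwrites t1 while still needed
              ("op", "r1", "add", ("r2", "r2"))]
entry = {("x", "r1"), ("y", "r2")}
exit_ = {("t3", "r1")}

print(validate_allocation(original, spilled, entry, exit_))     # prints True
print(validate_allocation(original, clobbering, entry, exit_))  # prints False
```

The validator never needs to know which spilling heuristic produced the moves; it only checks that the equations they induce are consistent, which is what makes the approach attractive for aggressive strategies.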
Verified compilation of object-oriented languages
Object layout and management, including dynamic allocation, field resolution, method dispatch and type casts, is a critical part of the compilation and runtime systems of object-oriented languages such as Java or C++. Formally verifying this part requires relating an abstract formalization of object operations at the level of the source language semantics with a concrete representation of objects in the memory model provided by the target low-level language. As this work relies heavily on pointer arithmetic, the proofs call for specific methods.
This year, under Xavier Leroy's supervision, Tahina Ramananandro tackled the issue of formally verifying object layout and management in class-based, single-inheritance languages. This is a step towards building a formally verified static compiler from a subset of Java bytecode to RTL (a CFG-style intermediate language of the CompCert back-end). Tahina Ramananandro has since been extending this formalization to multiple inheritance, in order to handle a subset of C++ as a source language.
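For the single-inheritance case, the relation between abstract field access and concrete memory offsets can be illustrated with a small Python sketch. It is not the verified layout algorithm: the type sizes in `SIZES`, the class table format and the names `align_up` and `layout` are assumptions made for illustration, and field names are assumed to be unique along an inheritance chain. The sketch places a subclass's own fields after those of its parent, so that the parent's layout is a prefix of the subclass's layout.

```python
# Hypothetical sizes in bytes; each type is assumed to be aligned on its size.
SIZES = {"byte": 1, "int": 4, "ptr": 8}

def align_up(offset, alignment):
    """Smallest multiple of alignment that is >= offset."""
    return (offset + alignment - 1) // alignment * alignment

def layout(classes, name):
    """classes maps a class name to (parent_or_None, [(field, type), ...]).
    Returns (field -> byte offset, size in bytes) for the named class."""
    parent, fields = classes[name]
    offsets, size = ({}, 0) if parent is None else layout(classes, parent)
    offsets = dict(offsets)                  # inherited fields keep their offsets
    for field, ty in fields:
        width = SIZES[ty]
        size = align_up(size, width)         # respect the alignment constraint
        offsets[field] = size
        size += width
    return offsets, size

classes = {
    "Point": (None, [("x", "int"), ("y", "int")]),
    "ColoredPoint": ("Point", [("color", "byte"), ("next", "ptr")]),
}

print(layout(classes, "Point"))         # prints ({'x': 0, 'y': 4}, 8)
print(layout(classes, "ColoredPoint"))  # prints ({'x': 0, 'y': 4, 'color': 8, 'next': 16}, 24)
```

The prefix property is what lets a field access compile to the same pointer arithmetic whether the object's dynamic class is `Point` or `ColoredPoint`. With multiple inheritance no single prefix ordering satisfies every parent, which suggests why that extension is substantially harder.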