Section: New Results
Formal verification of compilers
Verification of a compiler back-end
Participant : Xavier Leroy.
Our work on compiler verification (see section 3.3.1) started in 2004 and 2005 with the formal verification of a compiler back-end, translating the Cminor intermediate language down to PowerPC assembly code and performing a few optimizations (register allocation by graph coloring and constant propagation). This work is described in a paper presented at the POPL 2006 symposium. This year, Xavier Leroy implemented and proved correct one additional optimization pass: common subexpression elimination, performed by value numbering over basic blocks. He also revised the operational semantics of all source, intermediate and target languages so that they capture a trace of all input/output activities of the program, and proved that this trace of I/O events is preserved by all the passes of the back-end. This addition of traces leads to a significantly stronger observational equivalence between source and machine code than in our previous work.
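To illustrate the idea behind value numbering over a basic block, here is a minimal Python sketch. It is not the Coq development described above: the instruction format `(dest, op, args)` and the name `cse_block` are assumptions made for this example, and memory operations and commutativity are ignored.

```python
def cse_block(block):
    """Local value numbering over one basic block (illustrative sketch).

    block: list of (dest, op, args); args is a tuple of register names
    (str) or integer constants. Recomputations of an already available
    value are replaced by a register-to-register "move".
    """
    reg_vn = {}    # register -> value number it currently holds
    expr_vn = {}   # (op, argument value numbers) -> value number
    counter = 0
    out = []

    def fresh():
        nonlocal counter
        counter += 1
        return counter

    def vn_of(arg):
        if isinstance(arg, int):
            key = ("const", arg)
            if key not in expr_vn:
                expr_vn[key] = fresh()
            return expr_vn[key]
        if arg not in reg_vn:
            reg_vn[arg] = fresh()  # unknown value live on block entry
        return reg_vn[arg]

    for dest, op, args in block:
        if op == "move":
            vn = vn_of(args[0])
            out.append((dest, op, args))
        else:
            key = (op, tuple(vn_of(a) for a in args))
            vn = expr_vn.get(key)
            holder = None
            if vn is None:
                vn = fresh()
                expr_vn[key] = vn
            else:
                # Look for a register that still holds this value.
                holder = next((r for r, v in reg_vn.items() if v == vn), None)
            if holder is not None:
                out.append((dest, "move", (holder,)))
            else:
                out.append((dest, op, args))
        reg_vn[dest] = vn  # dest now holds this value number
    return out


example = [
    ("t1", "add", ("a", "b")),
    ("t2", "add", ("a", "b")),   # same value as t1: becomes a move
    ("a",  "add", ("t1", 1)),    # redefines a
    ("t3", "add", ("a", "b")),   # a changed, so this is recomputed
]
optimized = cse_block(example)
```

In this sketch the second `add` is rewritten into a move from `t1`, while the last one is kept because `a` has been redefined in between.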
Verification of a compiler front-end for a subset of the C language
In parallel with our work on the compiler back-end, we are also conducting the development and formal verification of compiler front-ends that target the Cminor intermediate language. The first such front-end generates Cminor code from a subset of the C language called Clight, similar to that used for programming critical embedded systems. Clight features all the arithmetic types and operators of C, as well as arrays, pointers, pointer arithmetic, function pointers, and all the structured control statements of C, but excludes unstructured control (goto, switch, longjmp).
A first prototype of a verified front-end for Clight was developed in 2005 and described in a paper presented this year at the Formal Methods conference.
As part of his Master's internship and under Sandrine Blazy's supervision, Thomas Moniot re-architected the prototype Clight front-end around the use of the CIL library developed at Berkeley. CIL provides an industrial-strength parser and type-checker for the C language, as well as a simplifier that eliminates or explicates many difficult features of this language. The use of CIL enables our front-end to correctly handle a much larger subset of the C language, including struct and union types. Thomas Moniot and Xavier Leroy extended and adapted the Coq proofs of semantic preservation for the C front-end. As a result of this work, the Compcert compiler is now able to compile realistic examples of C source code, of a few thousand lines, with almost no modifications.
Verification of a compiler front-end for Mini-ML
As part of her PhD thesis and under Xavier Leroy's supervision, Zaynah Dargaye investigates the development and formal verification of a translator from a small call-by-value functional language (Mini-ML) to Cminor. Mini-ML features functions as first-class values, arithmetic, constructed data types and shallow pattern-matching, making it an adequate target for Coq's program extraction facility.
Zaynah Dargaye developed and proved correct in Coq three translation passes: numbering of data constructors, lifting of function definitions to top-level, closure conversion and generation of Cminor code, as well as one optimization pass: the transformation of curried functions into n-ary functions. A paper describing this optimization and its proof of correctness was accepted for presentation at JFLA 2007.
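As a rough illustration of what closure conversion combined with lifting of functions to top level does, the following Python sketch works on a tiny lambda-calculus AST. The AST encoding (tagged tuples), the `mkclos` and `envget` node names, and the function `closure_convert` are all assumptions of this example; they do not reflect the intermediate languages of the actual Coq development.

```python
import itertools


def free_vars(e):
    """Free variables of an expression encoded as a tagged tuple."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "const":
        return set()
    if tag == "lam":                      # ("lam", param, body)
        return free_vars(e[2]) - {e[1]}
    return free_vars(e[1]) | free_vars(e[2])   # ("app", f, a) / ("add", a, b)


def closure_convert(e):
    """Return (main expression, dict of lifted top-level functions).

    Each lambda becomes ("mkclos", name, captured values); inside a
    lifted body, captured variables are read with ("envget", index).
    """
    functions = {}
    counter = itertools.count()

    def conv(e, env):
        # env maps captured variable names to slots of the current closure
        tag = e[0]
        if tag == "var":
            return ("envget", env[e[1]]) if e[1] in env else e
        if tag == "const":
            return e
        if tag == "lam":
            captured = sorted(free_vars(e))
            name = f"f{next(counter)}"
            inner_env = {v: i for i, v in enumerate(captured)}
            functions[name] = (e[1], conv(e[2], inner_env))
            # Captured values are computed in the enclosing context.
            return ("mkclos", name, [conv(("var", v), env) for v in captured])
        return (tag, conv(e[1], env), conv(e[2], env))

    return conv(e, {}), functions


# (fun x -> fun y -> x + y) 1 2
curried_add = ("lam", "x", ("lam", "y", ("add", ("var", "x"), ("var", "y"))))
program = ("app", ("app", curried_add, ("const", 1)), ("const", 2))
main, funs = closure_convert(program)
```

Here the inner function ends up as a top-level definition `f1` whose body reads `x` from slot 0 of its closure, while `f0` builds that closure from its own parameter.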
Certified translation validation
Certified translation validation provides an alternative to proving semantics preservation for the transformations involved in a certified compiler. Instead of proving that a given transformation is correct, we validate it a posteriori, i.e. we verify that the transformed program behaves like the original. The validation algorithm is described using the Coq proof assistant and proved correct, i.e. we prove that it only accepts transformed programs that are semantically equivalent to the original. In contrast, the program transformation itself can be implemented in any language and does not need to be proved correct.
Jean-Baptiste Tristan, under the supervision of Xavier Leroy, is investigating this approach in the case of instruction scheduling transformations. Instruction scheduling is a family of low-level optimizations that reorder the program instructions so as to exploit instruction-level parallelism and reduce overall execution time. The validation algorithm for instruction scheduling is based on symbolic execution of the original and transformed programs. During his Master's internship, Jean-Baptiste Tristan developed and proved correct a validator adequate for instruction scheduling at the basic-block level. At the start of his PhD, he is currently working on extending validation to trace scheduling, where instructions can move around conditional branches but not loops.
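The principle of validating a basic-block schedule by symbolic execution can be sketched in a few lines of Python. This is a simplified illustration under strong assumptions (register-only instructions of the form `(dest, op, args)`, no memory accesses, no trapping operations); the names `symbolic_exec` and `validate` are hypothetical and this is not the Coq validator described above.

```python
def symbolic_exec(block):
    """Map each register written by the block to a symbolic term.

    Terms are nested tuples: ("init", r) is the value of register r on
    block entry, ("const", n) is an integer literal, and (op, (t1, ...))
    is the result of applying op to argument terms.
    """
    state = {}

    def term(arg):
        if isinstance(arg, int):
            return ("const", arg)
        return state.get(arg, ("init", arg))

    for dest, op, args in block:
        state[dest] = (op, tuple(term(a) for a in args))
    return state


def validate(original, scheduled):
    """Accept only if both blocks leave every register with the same term."""
    s1, s2 = symbolic_exec(original), symbolic_exec(scheduled)
    regs = set(s1) | set(s2)
    return all(
        s1.get(r, ("init", r)) == s2.get(r, ("init", r)) for r in regs
    )


original = [
    ("r1", "add", ("a", "b")),
    ("r2", "mul", ("c", "d")),
    ("r3", "sub", ("r1", "r2")),
]
# Swapping two independent instructions is a legal schedule.
good_schedule = [original[1], original[0], original[2]]
# Moving r3 above the definition of r2 changes what r3 computes.
bad_schedule = [original[0], original[2], original[1]]

accepted = validate(original, good_schedule)
rejected = validate(original, bad_schedule)
```

The scheduler that produced `good_schedule` or `bad_schedule` is never inspected: only the validator's verdict matters, which is why it is the validator, not the transformation, that needs a correctness proof.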
Verified garbage collection
High-level languages that automate memory management, such as ML or Java, prevent a large class of dangerous bugs, and are relatively amenable to formal reasoning about programs. This makes these languages a good basis for developing high-confidence software systems, including system software and theorem provers (such as Coq). However, to trust systems built on top of garbage collection, it is necessary to trust the garbage collector itself. Although the correctness of GC algorithms is a very old subject, the correctness of actual implementations has not been well studied, and indeed there are few if any fielded systems containing a formally verified collector. The goal of this research is to fill this gap, specifically by developing a verified collector within the Compcert compiler framework.
This work is just beginning, and we are still investigating several alternative approaches. Currently, the most promising idea is to code the collector directly in the Cminor intermediate language, and use mechanized proof assistance to verify the correctness of this code. The collector can then be interfaced easily with code generated from existing front-ends that generate Cminor; moreover, the existing certified compilation pipeline from Cminor to machine code can be used on the collector code itself. However, so far we have little experience in proving properties of Cminor programs (as opposed to properties of the compilation system). One possible mechanism is to adapt the Caduceus C-language verification-condition generator, developed by Filliâtre and others at LRI, to work on Cminor programs; we are undertaking experiments with the existing Caduceus to determine its suitability for verifying a collector.