Section: New Results
Keywords: decoding algorithms, quantum codes, linear cryptanalysis, algebraic attack, BCH codes, Gröbner bases, LDPC codes, code reconstruction, reverse engineering.
Decoding techniques, algebraic systems solving and applications
Participants: Daniel Augot, Christophe Chabot, Mathieu Cluzeau, Maxime Cote, Cédric Faure, Matthieu Finiasz, Benoît Gérard, Jean-Pierre Tillich.
Many cryptanalyses rely on approximating a cryptosystem by simpler functions. For instance, one tries to approximate the system by low-degree polynomials, either in one variable over a huge finite field or in several variables over the Boolean field. Once such an approximation has been found, the problem of finding the key or of inverting the system, which is normally intractable with a direct approach, is rewritten as a system of simple equations, each of which holds with some probability. The closer the approximation, the higher this probability. For instance, a classical cryptanalysis of stream ciphers based on a linear feedback shift register filtered by a Boolean function models the attacked cipher as the transmission of a linear function through a very noisy channel. Removing the noise then amounts to decoding a certain linear code. This code is highly structured, and one of the most efficient methods to decode it exploits the fact that it has low-density parity-check equations, so that it can be decoded as an LDPC (low-density parity-check) code with iterative algorithms.
Furthermore, the problem of finding such good approximations of ciphers also leads to a decoding problem: finding good approximations by linear functions amounts to decoding the first-order Reed-Muller code. Local decoding is then used in this context, and enables various attacks, such as correlation attacks or linear cryptanalysis.
Besides the cryptographic applications of decoding algorithms, we also investigate two new application domains for decoding algorithms: reverse engineering of communication systems, and quantum error-correcting codes, some of which we have shown can be decoded successfully with iterative decoding algorithms.
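The link between linear approximations and first-order Reed-Muller decoding can be illustrated with a minimal sketch. It assumes the Boolean function is small enough to be given as a full truth table, and it uses exhaustive maximum-likelihood decoding through a fast Walsh-Hadamard transform, not the local decoding algorithms mentioned above. The function names are hypothetical and chosen for illustration only.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of a Boolean function.

    truth_table is a list of 0/1 values of length 2**m, indexed by the
    integer x whose bits are the input variables. The returned list W
    satisfies W[a] = sum over x of (-1)**(f(x) XOR <a, x>).
    """
    w = [1 - 2 * b for b in truth_table]  # map 0 -> +1, 1 -> -1
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b
        h *= 2
    return w


def best_affine_approximation(truth_table):
    """Return (mask, constant, agreements) for the closest affine function.

    The affine function x -> <mask, x> XOR constant agrees with the input
    on `agreements` points. This is maximum-likelihood decoding of the
    first-order Reed-Muller code on the binary symmetric channel.
    """
    n = len(truth_table)
    spectrum = walsh_hadamard(truth_table)
    mask = max(range(n), key=lambda a: abs(spectrum[a]))
    constant = 0 if spectrum[mask] >= 0 else 1
    agreements = (n + abs(spectrum[mask])) // 2
    return mask, constant, agreements


# Toy example (assumed data): the affine function <5, x> XOR 1 on 4 bits,
# observed with three bit errors.
noisy = [(bin(x & 5).count("1") & 1) ^ 1 for x in range(16)]
for position in (0, 7, 12):
    noisy[position] ^= 1
print(best_affine_approximation(noisy))  # (5, 1, 13)
```

The cost of this sketch is proportional to m·2^m for a function of m variables, which is why local decoding is needed when the number of input bits is that of a real cipher.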
Linear cryptanalysis and decoding Reed-Muller codes.
The first family of codes that we have studied in detail is the family of Reed-Muller codes. Being able to decode members of this family efficiently on various channels is very helpful for cryptanalysis: decoding first-order Reed-Muller codes on the binary symmetric channel is useful for linear cryptanalysis, whereas decoding general Reed-Muller codes on the erasure channel can be used in algebraic attacks on ciphers. In particular, in his thesis [75], Cédric Tavernier found new (local) decoding algorithms for first-order Reed-Muller codes over the binary symmetric channel, which improve upon the Goldreich-Rubinfeld-Sudan algorithm. These algorithms enabled him to find new linear approximations of several rounds of the DES with biases of the same order as Matsui's approximations.
Recent results:

Linear cryptanalysis of block ciphers: following the work by C. Tavernier, B. Gérard has explored how to improve on Matsui's linear cryptanalysis by using all these new equations. It turns out that recovering the key from these approximations is equivalent to decoding a linear code on the Gaussian channel. This relationship has been used to evaluate accurately how many plaintext-ciphertext pairs are needed in this new attack, and also to suggest an algorithm based on decoding techniques that recovers the secret key much more efficiently than previously known methods: [40].

Generalization of the Guruswami-Sudan list decoding algorithm to Reed-Muller codes: [48].
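The connection between linear cryptanalysis and the Gaussian channel stated in the first item above can be motivated by a standard heuristic computation. The notation here is an assumption made for illustration and is not taken from [40]: N is the number of plaintext-ciphertext pairs, the i-th approximation has bias \epsilon_i and involves a key parity bit k_i, and T_i counts the pairs for which the observable part of that approximation evaluates to 0.

```latex
% Assumed notation: N pairs, bias \epsilon_i, key parity k_i, counter T_i.
\[
  T_i \sim \mathrm{Bin}\!\left(N,\ \tfrac12 + (-1)^{k_i}\epsilon_i\right),
  \qquad
  Y_i = \frac{2T_i - N}{\sqrt{N}}
      \approx (-1)^{k_i}\, 2\epsilon_i\sqrt{N} + Z_i,
  \qquad
  Z_i \sim \mathcal{N}\!\left(0,\ 1 - 4\epsilon_i^2\right).
\]
```

Each k_i is a fixed linear combination of key bits, so the vector (k_1, ..., k_n) is a codeword of a binary linear code, and the normalized counters Y_i look like that codeword sent with antipodal signalling through a channel with nearly unit-variance Gaussian noise. Under the additional assumption that the Z_i are independent, the signal-to-noise ratio of each coordinate grows like N\epsilon_i^2, which is consistent with the usual rule that the number of pairs needed scales as 1/\epsilon^2.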
Solving algebraic systems and applications to coding.
Gröbner basis algorithms for solving algebraic systems are an important tool that can be applied both to error correction and to cryptography, in the context of algebraic attacks.
Recent results:

Decoding algorithms for cyclic codes with Gröbner bases: it was demonstrated that decoding formulas can be found for all cyclic codes by an offline Gröbner basis computation. From the efficiency point of view, however, it turned out to be better to perform an online Gröbner basis computation, whose cost is reasonable. This makes it possible to decode any cyclic code up to its true minimum distance [67], [70]. An improved paper has been accepted for publication in the Journal of Symbolic Computation, with computational timings for nontrivial codes of considerable length: [11].

D. Augot is coauthor, with E. Betti and E. Orsini, of a chapter introducing cyclic codes and their decoding algorithms, in a book devoted to Gröbner bases, coding and cryptography, in the RISC Book series: [47].

Algebraic attacks: we have investigated some variants of recent techniques for algebraic attacks, especially for stream cipher cryptanalysis: [62], [58], [60].
New decoding algorithms for error correction.
We also investigate more traditional aspects of coding theory by improving some decoding algorithms for error correction and by searching for codes with good decoding performance.
Recent results:

Generalization of Roth and Ruckenstein's method: in 2000, a paper by Roth and Ruckenstein described a very efficient method for implementing the Sudan decoding algorithm. During his internship, A. Zeh successfully generalized this method to the Guruswami-Sudan list decoding algorithm, where multiplicities are involved: [21], [36], [66].

Families of codes with good iterative decoding algorithms: codes of this kind have by now probably become the most popular coding scheme, owing to their exceptional performance at a reasonable algorithmic cost. We have in particular studied families of codes defined over large alphabets which are in a sense intermediate between turbo-codes and LDPC codes, and have found several instances of this family whose performance is quite close to the Shannon limit. This work has been supported by France Telecom: [65].
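To give a concrete idea of what an iterative decoding algorithm looks like, here is a minimal sketch of the simplest one: peeling decoding of a binary code on the erasure channel. It is a deliberate simplification and not the decoder studied in [65], which works over large alphabets. The toy parity checks below are those of a [7,4] Hamming code, chosen as an assumption because they are small, and the function name is hypothetical.

```python
def peel_erasures(checks, received):
    """Iteratively fill erased bits using parity checks.

    checks   -- list of parity checks, each a list of bit positions whose
                XOR must be 0 in any codeword
    received -- list of 0/1 values, with None marking an erased position

    A check with exactly one erased position determines that bit. The
    loop stops when no check can make progress, so the result may still
    contain None if the erasures form a stopping set.
    """
    word = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:
            erased = [i for i in check if word[i] is None]
            if len(erased) == 1:
                parity = 0
                for i in check:
                    if word[i] is not None:
                        parity ^= word[i]
                word[erased[0]] = parity
                progress = True
    return word


# Toy parity checks (assumed example): a [7,4] Hamming code.
CHECKS = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]

# The codeword 1011001 with positions 1 and 2 erased.
print(peel_erasures(CHECKS, [1, None, None, 1, 0, 0, 1]))
```

On sparse parity-check matrices each pass touches only a few positions per check, which is what keeps the algorithmic cost of this kind of decoding reasonable.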
Quantum codes.
The knowledge we have acquired in iterative decoding techniques has also led us to study whether the very same techniques could be used to decode quantum codes. Part of the old ACI project “RQ”, in which we were involved, and the new ANR project “COCQ” are about this topic. Note that protecting quantum information from external noise is an issue of paramount importance for building a quantum computer. It is also worth noting that all quantum error-correcting code schemes proposed up to now suffer from the very same problem that the first (classical) error-correcting codes had: there are constructions of good quantum codes, but for the best of them it is not known how to decode them in polynomial time. Our approach for overcoming this problem has been to study whether the family of turbo-codes and LDPC codes (and the associated iterative decoding algorithms) has a quantum counterpart. We have shown that the classical iterative decoding algorithms can be generalized to the quantum setting, and have come up with some families of quantum LDPC codes and quantum serial turbo-codes with rather good performance under iterative decoding [32], [19], [64], [20].
Reverse engineering of communication systems.
To evaluate the quality of a cryptographic algorithm, it is usually assumed that its specifications are public: in accordance with Kerckhoffs's principle (stated in his paper La Cryptographie militaire, published in 1883), it would be dangerous to rely, even partially, on the fact that the adversary does not know those specifications. However, this fundamental rule does not mean that the specifications are known to the attacker. In practice, before mounting a cryptanalysis, it is necessary to strip off the data. This reverse engineering process is often subtle, even when the data formatting is not concealed on purpose. A typical case is interception: some raw data, not necessarily encrypted, is observed at the output of a noisy channel. To access the information, the whole communication system first has to be disassembled and every constituent reconstructed. Our activity within this domain, whose first aim is to establish the scientific and technical foundations of a discipline which does not yet exist at an academic level, has been supported by two industrial contracts driven by the DGA.
Recent results:

M. Cluzeau and J.-P. Tillich have found a lower bound on the number of codewords which have to be intercepted in order to recover the code. This lower bound turns out to be tight for several interesting code families, such as LDPC codes: [26].
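The role played by the number of intercepted codewords can be illustrated in a much simpler setting than the one treated in [26]. The sketch below assumes a noiseless interception and a known dimension k, neither of which is stated in the source, and it does not reproduce the lower bound itself: it only counts how many intercepted words are consumed before their span over GF(2) reaches dimension k. Words are stored as Python integers used as bit masks, and the function names are hypothetical.

```python
def insert_into_basis(basis, word):
    """Reduce `word` against `basis` and add it if it is independent.

    basis maps the position of a leading bit to a vector having that
    leading bit. Returns True if the rank grew, False if `word` was
    already in the span.
    """
    while word:
        pivot = word.bit_length() - 1
        if pivot not in basis:
            basis[pivot] = word
            return True
        word ^= basis[pivot]
    return False


def words_until_recovered(intercepted, k):
    """Return (count, basis): count is the number of words read before
    the span reaches dimension k, or None if it never does."""
    basis = {}
    for count, word in enumerate(intercepted, start=1):
        insert_into_basis(basis, word)
        if len(basis) == k:
            return count, basis
    return None, basis


# Assumed toy data: four independent words of length 7 and some of their sums.
g0, g1, g2, g3 = 0b1000110, 0b0100011, 0b0010111, 0b0001101
stream = [g0, g1, g0 ^ g1, g2, g0 ^ g2, g3]
count, basis = words_until_recovered(stream, k=4)
print(count, sorted(basis))
```

Redundant words such as g0 ^ g1 bring no new information, which is why more than k interceptions are generally needed even without noise. With a noisy channel, plain Gaussian elimination no longer applies, and the question of how many codewords are necessary becomes the harder problem addressed in [26].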