Section: Scientific Foundations
Floating-point numbers are represented by triplets $(s, n, e)$ associated with the value
$$(-1)^s \cdot n \cdot \beta^e,$$
where $\beta$ is the radix of the system. In practice, $\beta = 2$ or $\beta = 10$; however, studying the system independently of the value of $\beta$ allows a better understanding of its behaviour. An arithmetic operator handling floating-point numbers is more complex than the same operator restricted to integers. It must round the result correctly according to one of the four rounding modes of the IEEE-754 standard (which specifies the number formats and the arithmetic operations), handle both the mantissa and the exponent of the operands, and deal with the various exceptional cases (infinities, subnormal numbers, etc.).
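To make the triplet representation concrete, the following Python sketch decodes an IEEE-754 double (radix 2) into a sign, an integer mantissa and an exponent. The function name `decode_double` is hypothetical and chosen for this illustration; the bit layout used (1 sign bit, 11 exponent bits, 52 fraction bits) is that of the binary64 format.

```python
import math
import struct

def decode_double(x: float) -> tuple[int, int, int]:
    """Return (s, n, e) with x == (-1)**s * n * 2**e exactly (finite x only)."""
    if math.isinf(x) or math.isnan(x):
        raise ValueError("infinities and NaNs have no (s, n, e) triplet")
    # Reinterpret the 8 bytes of the double as a 64-bit unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = bits >> 63                      # sign bit
    biased_exp = (bits >> 52) & 0x7FF   # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)   # 52-bit fraction field
    if biased_exp == 0:
        # Zero or subnormal number: no implicit leading bit.
        return s, fraction, -1074
    # Normal number: restore the implicit leading bit of the mantissa.
    return s, fraction | (1 << 52), biased_exp - 1075
```

The subnormal branch is one of the special cases mentioned above: it uses a fixed exponent and drops the implicit leading bit, so an operator working on these triplets has to treat it separately.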
Formal Specifications and Proofs
Widely publicized problems (the Pentium bug, or the fact that 2001!/2000! = 1 in Maple 7) show that arithmetic correctness is sometimes difficult to obtain on a computer. Few tools handle rigorous proofs on floating-point data. However, thanks to the IEEE-754 standard, the arithmetic operations are completely specified, which makes it possible to build proofs of algorithms and properties. Such proofs are nevertheless difficult to present, because they include the long list of special cases generated by these calculations. The formalization of the standard, begun in 2000 in collaboration with the Lemme project (ARC AOC), makes it possible to use a proof assistant such as Coq to guarantee that each particular case is considered and handled correctly. Thanks to funding from CNRS and NASA, the same specification is now also available in PVS.
Systems such as Coq and PVS make it possible to define new objects and to derive formal consequences of these definitions. Thanks to higher-order logic, we establish properties in a very general form. The proof is built interactively by guiding the assistant with high-level tactics. At the end of each proof, Coq builds an internal object, called a proof term, which contains all the details of the derivations and guarantees that the theorem is valid. PVS is usually considered less reliable because it builds no proof term.
Elementary Functions and Correct Rounding
Many libraries for elementary functions are currently available. The functions in question are typically those defined by the C99 standard, and are offered by vendors of processors, compilers or operating systems. The majority of these libraries attempt to reproduce the mathematical properties of the given functions: monotonicity, symmetries and sometimes range.
Correct rounding of the result is not required by the IEEE-754 standard: during the elaboration of this standard, it was considered that correctly rounded elementary functions were impossible to obtain at a reasonable cost, because of the so-called Table Maker's Dilemma: an elementary function is evaluated to some internal accuracy (usually higher than the target precision), and then rounded to the target precision. What accuracy is necessary to ensure that rounding this evaluation is equivalent to rounding the exact result, for all possible inputs? The answer to this question is generally unknown, which means that correctly rounding elementary functions requires arbitrary multiple-precision, which is very slow and resource-consuming.
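The dilemma can be illustrated with a small rounding test. The Python sketch below is only an illustration: the helper names `round_nearest` and `can_round`, the 4-bit target precision and the sample value are assumptions made for the example, and exact rationals stand in for an actual function evaluation. The test asks whether every value within `eps` of the approximation `y` rounds to the same result.

```python
from fractions import Fraction

def round_nearest(v: Fraction, p: int) -> Fraction:
    """Round a positive rational v to p significant bits (nearest, ties to even)."""
    # Choose e so that v / 2**e lies in [2**(p-1), 2**p).
    e = v.numerator.bit_length() - v.denominator.bit_length() - p
    if v / Fraction(2) ** e >= 2 ** p:
        e += 1
    # round() on a Fraction returns the nearest integer, ties to even.
    return round(v / Fraction(2) ** e) * Fraction(2) ** e

def can_round(y: Fraction, eps: Fraction, p: int) -> bool:
    """True if all values in [y - eps, y + eps] round to the same p-bit number.

    Assumes y - eps > 0.
    """
    return round_nearest(y - eps, p) == round_nearest(y + eps, p)

# Hypothetical exact result lying just above the midpoint 17/16
# of the two 4-bit numbers 1 and 9/8.
exact = Fraction(17, 16) + Fraction(1, 2 ** 20)

coarse_ok = can_round(exact, Fraction(1, 2 ** 10), 4)  # error bound 2**-10
fine_ok = can_round(exact, Fraction(1, 2 ** 25), 4)    # error bound 2**-25
```

With the coarse error bound the interval straddles the midpoint, so the rounded result cannot be decided; with the finer bound it can. The closer the exact result lies to a midpoint, the more accuracy is needed, and the difficulty is that this distance is not known in advance for all inputs.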
Correctly rounded libraries do exist, such as MPFR (http://www.mpfr.org), the Accurate Portable Library released by IBM in 2002, or the libmcr library, released by Sun Microsystems in late 2004. However, their worst-case execution time and memory consumption are up to 10,000 times worse than those of usual libraries, which is the main obstacle to their generalized use.
We have focussed in previous years on computing bounds on the intermediate precision required to correctly round some elementary functions in IEEE-754 double precision. These bounds allow us to design algorithms that use a large but fixed precision instead of arbitrary multiple-precision, which makes it possible to offer correct rounding with an acceptable overhead: we have experimental code where the cost of correct rounding is negligible on average, and less than a factor of 10 in the worst case. It also makes it possible to prove the correct-rounding property, and to prove bounds on the worst-case performance of our functions. This concern for proof is mostly absent from IBM's and Sun's libraries, and indeed we have found many misrounded values in each of them.
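A two-phase evaluation of this kind can be sketched as follows. This is not the code of any library mentioned above: it works in radix 10 with Python's `decimal` module, whose `exp` serves as a stand-in for a hand-written approximation, and the three precision constants and the function names are hypothetical. In a real design the accurate-phase precision would come from a proven worst-case bound, so that the final error branch is never reached.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

TARGET_DIGITS = 16     # hypothetical target precision (decimal digits)
QUICK_DIGITS = 19      # hypothetical quick-phase working precision
ACCURATE_DIGITS = 50   # hypothetical fixed accurate-phase precision

def try_round(x: Decimal, working_digits: int):
    """Evaluate exp(x) at working_digits; return the rounded result if it is decided."""
    work = Context(prec=working_digits, rounding=ROUND_HALF_EVEN)
    target = Context(prec=TARGET_DIGITS, rounding=ROUND_HALF_EVEN)
    wide = Context(prec=working_digits + 5, rounding=ROUND_HALF_EVEN)
    y = work.exp(x)
    # Conservative error bound: one unit in the last working digit of y.
    ulp = Decimal(1).scaleb(y.adjusted() - working_digits + 1)
    # Round both ends of the enclosing interval to the target precision.
    lo = target.plus(wide.subtract(y, ulp))
    hi = target.plus(wide.add(y, ulp))
    return lo if lo == hi else None

def exp_two_phase(x: Decimal) -> Decimal:
    """Quick phase first; fall back to the fixed-precision accurate phase."""
    for digits in (QUICK_DIGITS, ACCURATE_DIGITS):
        result = try_round(x, digits)
        if result is not None:
            return result
    raise ArithmeticError("accurate-phase precision too small for this input")
```

Most inputs are settled by the quick phase, which is why the average overhead of correct rounding can be small; only inputs whose result lies very close to a rounding boundary pay for the accurate phase.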
The design of a library with correct rounding also requires the study of algorithms in large (but not arbitrary) precision, as well as the study of more general methods for the three stages of the evaluation of elementary functions: argument reduction, approximation, and reconstruction of the result.
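As an illustration of the three stages, here is a minimal Python sketch for the exponential function. It is a textbook-style example, not the method of any particular library: the name `exp_sketch`, the reduction modulo ln 2 and the degree-13 Taylor polynomial are assumptions made for the illustration, and the result is accurate but not correctly rounded.

```python
import math

LN2 = math.log(2.0)

def exp_sketch(x: float) -> float:
    """Approximate exp(x) in three stages (not correctly rounded)."""
    # 1. Argument reduction: write x = k*ln(2) + r with |r| <= ln(2)/2.
    k = round(x / LN2)
    r = x - k * LN2
    # 2. Approximation: degree-13 Taylor polynomial of exp(r), Horner form.
    p = 1.0
    for i in range(13, 0, -1):
        p = 1.0 + p * r / i
    # 3. Reconstruction: exp(x) = 2**k * exp(r).
    return math.ldexp(p, k)
```

The reduction step keeps `r` small so that a short polynomial suffices, and the reconstruction is an exact scaling by a power of two as long as the result stays in the normal range. Each stage contributes its own rounding and approximation errors, which is what has to be bounded when correct rounding is the goal.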