## Section: New Results

### Efficient Floating-Point Arithmetic and Applications

Participants : Nicolas Brisebarre, Claude-Pierre Jeannerod, Vincent Lefèvre, Nicolas Louvet, Jean-Michel Muller, Adrien Panhaleux, Guillaume Revy, Gilles Villard.

#### Computation of Integer Powers in Floating-point Arithmetic

Peter Kornerup (Odense University, Denmark), Christoph Lauter (Intel, USA), Vincent Lefèvre, Nicolas Louvet and Jean-Michel Muller have introduced several algorithms for accurately raising a floating-point number to a positive integer power, assuming a fused multiply-add (fma) instruction is available. For bounded, yet very large, values of the exponent, they aim at obtaining correctly rounded results in round-to-nearest mode: their algorithms return the floating-point number nearest the exact value [17].

#### Correctly Rounded Sums

Peter Kornerup, Vincent Lefèvre, Nicolas Louvet, and Jean-Michel Muller have presented a study of some basic blocks needed in the design of floating-point summation algorithms. In particular, they have shown that, among the comparison-free algorithms performing only floating-point additions and subtractions, the 2Sum algorithm introduced by Knuth is minimal, both in the number of operations and in the depth of the dependency graph. Under reasonable conditions, they have also proven that no algorithm performing only round-to-nearest additions and subtractions can compute the correctly rounded sum of three or more floating-point numbers. Starting from an algorithm due to Boldo and Melquiond, they have presented new results on the computation of the correctly rounded sum of three floating-point numbers [31].
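For reference, Knuth's 2Sum can be written in six round-to-nearest additions and subtractions, with no comparison on the inputs (variable names below are ours):

```c
/* Knuth's 2Sum: computes s = RN(a + b) together with the error term t
   such that a + b = s + t exactly, in six operations and without any
   branch or comparison. */
void two_sum(double a, double b, double *s, double *t) {
    double ap, bp, da, db;
    *s = a + b;
    ap = *s - b;    /* approximation of a */
    bp = *s - ap;   /* approximation of b */
    da = a - ap;    /* rounding error on a */
    db = b - bp;    /* rounding error on b */
    *t = da + db;
}
```

Note that such error-free transforms rely on strict IEEE 754 semantics: value-changing compiler optimizations (e.g. `-ffast-math`) break them.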

#### Midpoints and Exact Points of Algebraic Functions

When implementing a function f in floating-point arithmetic with correct rounding, it is important to know
whether there are input floating-point values x such that f(x) is either the exact middle of two consecutive
floating-point numbers or a floating-point number itself: in the first case f(x) is said to be a *midpoint*,
and in the second case an *exact point*. In [42], Claude-Pierre Jeannerod, Nicolas Louvet, Jean-Michel
Muller and Adrien Panhaleux have studied the midpoints and exact points of some usual algebraic functions:
division, inversion, square root, reciprocal square root, 2D Euclidean norm and its reciprocal, and 2D normalization.
The results and techniques presented in this paper apply to both the binary
and the decimal formats defined in the IEEE 754-2008 standard.

#### Binary Floating-point Operators for VLIW Integer Processors

In a joint work with H. Knochel and C. Monat (STMicroelectronics Compilation Expertise Center, Grenoble), C.-P. Jeannerod, G. Revy, and G. Villard [28] have extended the square rooting method of [46], based on bivariate polynomial evaluation, to division. Compared to square root, the main difficulty was to automatically validate the numerical accuracy of the fast bivariate polynomial evaluation code used to approximate the quotient. This first required introducing a new set of approximation and evaluation error conditions that are sufficient to ensure correct rounding. Then, efficient heuristics were proposed to generate such evaluation codes and to validate their accuracy according to the new error conditions. Finally, a complete C implementation has been written (which is also part of FLIP 1.0 (§ 5.2)). With the ST200 VLIW compiler the speed-up factor is almost 1.8 (compared to FLIP 0.3).

C.-P. Jeannerod and G. Revy have worked on the design and implementation of a correctly rounded reciprocal square root operator. In [29] they proposed a high-ILP algorithm as well as an efficient rounding algorithm (for rounding to nearest even). Their implementation, which fully supports subnormals, made it possible to speed up the reciprocal square root of FLIP (§ 5.2) by a factor of 2. This compound operator is also about twice as fast as a division followed by a square root, and entails only one rounding error instead of two.

#### Exact and Approximated Error of the FMA

The fused multiply-add (FMA) instruction, specified by the IEEE 754-2008 Standard for Floating-Point Arithmetic, eases some calculations, and is already available on some current processors. Sylvie Boldo (Proval Team, Saclay) and Jean-Michel Muller have first extended an earlier work on the computation of the exact error of an FMA (by giving more general conditions and providing a formal proof). Then, they have presented a new algorithm that computes an approximation to the error of an FMA, together with error bounds and a formal proof for that algorithm [39].