## Section: New Results

### Exact and Approximated Data-Reuse Optimizations for Tiling with Parametric Sizes

Participants: Alain Darte, Alexandre Isoard.

As mentioned in Section 7.6, loop tiling is a loop transformation widely used to improve spatial and temporal data locality, to increase computation granularity, and to enable blocking algorithms, which are particularly useful when offloading kernels to computing units with smaller memories. When caches are not available or not used, data transfers and local storage must be software-managed, and some useless remote communications can be avoided by exploiting data reuse between tiles. Important parameters of tiling are the tile sizes, which impact the size of the required local memory. However, for most analyses involving several tiles, which is the case for inter-tile data reuse, the tile sizes induce non-linear constraints unless they are numerical constants. This complicates or prevents a parametric analysis with polyhedral optimization techniques.
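
The source of the non-linearity can be illustrated on a one-dimensional loop. The notation below (loop index $i$, tile index $I$, tile size $s$, loop bound $N$) is an assumption chosen for illustration and does not come from the report itself.

```latex
% Illustrative notation (not from the source): i = loop index, I = tile index,
% s = tile size, N = loop bound.
\[
  0 \le i < N, \qquad s\,I \le i \le s\,I + s - 1 .
\]
% For a fixed size, e.g. s = 32, the tile constraint is affine in (i, I):
\[
  32\,I \le i \le 32\,I + 31 .
\]
% For parametric s, the term s\,I is a product of a parameter and a variable,
% so the constraint is quadratic and falls outside Presburger arithmetic.
```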

We showed that, when tiles are executed in sequence along tile axes, the
parametric (with respect to tile sizes) analysis for inter-tile data reuse is
nevertheless possible, i.e., one can determine, at compile-time and in a
parametric fashion, the copy-in and copy-out data sets for all tiles, with
inter-tile reuse, as well as sizes for the induced local memories (this is also
connected to the liveness analysis described in Section 7.12). When transfers
are approximated, the situation is much more complex and requires a careful
analysis to guarantee correctness when data are both read and written.
We provide the mathematical foundations to make such approximations possible,
thanks to the introduction of the concept of *pointwise functions*.
Combined with hierarchical tiling, this result opens perspectives for the
automatic generation of blocking algorithms, guided by parametric cost models,
where blocks can be pipelined and/or can contain parallelism. Previous work on
FPGAs and GPUs already showed the value and feasibility of such automation
with tiling, but in a non-parametric fashion.
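
To make the notions of copy-in and copy-out sets with inter-tile reuse more concrete, here is a minimal brute-force sketch in Python. It is not the parametric method described above: it enumerates the sets for one fixed tile size and a hypothetical one-dimensional kernel `B[i] = A[i] + A[i + 1]`. The names `accesses` and `tile_sets` are illustrative, and the sketch assumes that tiles run in sequence and that everything already loaded or written stays in local memory.

```python
from typing import List, Set, Tuple

Cell = Tuple[str, int]  # (array name, index)


def accesses(i: int) -> Tuple[List[Cell], List[Cell]]:
    """Hypothetical kernel B[i] = A[i] + A[i + 1]: returns (reads, writes)."""
    return [("A", i), ("A", i + 1)], [("B", i)]


def tile_sets(n: int, s: int) -> Tuple[List[Set[Cell]], List[Set[Cell]]]:
    """Enumerate per-tile load and store sets for iterations 0..n-1, tile size s."""
    num_tiles = (n + s - 1) // s
    ins: List[Set[Cell]] = []
    outs: List[Set[Cell]] = []
    for t in range(num_tiles):
        tile_in: Set[Cell] = set()
        tile_out: Set[Cell] = set()
        for i in range(t * s, min((t + 1) * s, n)):
            reads, writes = accesses(i)
            # Keep only reads not preceded by a write in the same tile.
            tile_in.update(c for c in reads if c not in tile_out)
            tile_out.update(writes)
        ins.append(tile_in)
        outs.append(tile_out)

    # Load what the tile reads, minus what earlier tiles read or wrote
    # (assumed to be still available locally).
    loads: List[Set[Cell]] = []
    local: Set[Cell] = set()
    for t in range(num_tiles):
        loads.append(ins[t] - local)
        local |= ins[t] | outs[t]

    # Store what the tile writes, minus what a later tile overwrites.
    stores: List[Set[Cell]] = [set() for _ in range(num_tiles)]
    written_later: Set[Cell] = set()
    for t in reversed(range(num_tiles)):
        stores[t] = outs[t] - written_later
        written_later |= outs[t]
    return loads, stores


loads, stores = tile_sets(8, 4)
print(sorted(loads[1]))  # A[4] is missing: it was already loaded for tile 0
```

With `n = 8` and `s = 4`, the second tile reads `A[4]` to `A[8]` but only needs to load `A[5]` to `A[8]`. A parametric analysis aims to obtain such sets as closed-form expressions in the tile size rather than by enumeration.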

Our method is currently implemented with the `iscc` calculator of
`ISL`, a library for the manipulation of integer sets defined with
Presburger arithmetic. A complete implementation within the PPCG compiler is in
progress (see also Section 6.7).

We believe that our approximation technique, which turns out to be fairly powerful, can be used for other applications linked to the extension of the polyhedral model. Our future work will be to derive efficient approximation techniques, either because the program cannot be fully analyzed, or because approximations can speed up the analysis or simplify its results without losing much in terms of memory transfers and/or memory sizes.

A preliminary version of this work was presented at the IMPACT'14 workshop [19]. A revised version was published at the International Conference on Compiler Construction (CC'15) [3].