## Section: Scientific Foundations

### Efficient algorithmics for code coupling in complex simulations

Participants : Olivier Coulaud, Aurélien Esnard, Damien Genet, Nicolas Richart, Jean Roman, Jérôme Soumagne.

Many important physical phenomena in material physics and climate modelling are inherently complex. They are often addressed with multiphysics or multiscale approaches that couple different models and codes, typically one model per scale or physics, each implemented by a parallel code. For instance, crack propagation is modelled at two scales: an atomistic model and a continuum model discretized by a finite element method. Such phenomena are simulated by coupling different parallel codes, for example a molecular dynamics code with an elasticity code.

The experience that we have acquired in the `ScAlApplix` project, through
the activities in crack propagation simulations with LibMultiScale and
in M-by-N computational steering (coupling simulations with parallel
visualization tools) with `EPSN`, shows that, while the modelling
aspects are well studied, several problems in parallel and distributed
algorithms remain open.
In the context of code coupling in `HiePACS`, we want to contribute
to the following points.

#### Efficient schemes for multiscale simulations

As mentioned previously, many important physical phenomena, such as material deformation and failure (see Section 4.2), are inherently multiscale processes that cannot always be modelled by a continuum model alone. Fully microscopic simulations of most domains of interest are not computationally feasible. Therefore, researchers must turn to multiscale methods that couple micro and macro models. Combining different scales, such as quantum-atomistic or atomistic, mesoscale and continuum, remains a challenge: obtaining efficient and accurate schemes requires exchanging information between the different scales both efficiently and effectively. We are currently involved in two national research projects (ANR) that focus on multiscale schemes. More precisely, the models that we have started to study are the quantum-to-atomic coupling (QM/MM coupling) in the NOSSI ANR and the atomic-to-dislocation coupling in the OPTIDIS ANR (proposal for the 2010 COSINUS call of the French ANR).

#### Coupling of complex simulations based on the hypergraph model

The performance of coupled codes depends on how well the data are
distributed among the processors. Generally, the data distribution of
each code is built independently of the others to obtain the best
load balancing. But once the codes are coupled, the naive use of these
decompositions can lead to significant imbalance, in particular when
there is an overlap zone between the different models. Therefore,
modelling the coupling itself is crucial to improve performance and to
ensure good scalability of the coupled codes. The goal here is to find
the best data distribution for the coupled code as a whole, and not
only for each standalone code.

The main idea is to use a hypergraph model, such as the one provided by
the `ZOLTAN` toolkit, and to take more information into account in the
coupling than classical graph partitioners do. Indeed, in the
hypergraph model, the hyperedge cut accurately measures the
communication volume, whereas in the graph model the edge cut only
approximates it. Moreover, recent work on hypergraph partitioning with
fixed vertices has demonstrated its effectiveness for dynamic load
balancing of adaptive simulations. As the load-balancing problem is
quite close to the redistribution problem, we expect to derive new
redistribution algorithms using similar strategies. For example, we
should add to the communication cost the redistribution cost between
codes (which depends on the volume of data exchanged), add to the
computation cost the interpolation cost, and so on. In addition, we
expect the greater expressiveness of the hypergraph model to help us
model each individual simulation code more accurately, and thus to
improve its scalability thanks to a better partition quality.
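The gap between the two metrics can be seen on a toy example. The following Python sketch (an illustration with a hand-picked star graph and partition, not using `ZOLTAN` itself) compares the graph edge cut with the hypergraph connectivity metric, using column-style nets where each vertex's net spans the vertex and its neighbours:

```python
# Toy mesh: a star graph — vertex 0 (in part A) coupled to 1, 2, 3 (in part B).
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
part = {0: "A", 1: "B", 2: "B", 3: "B"}

# Graph model: edge cut = number of edges crossing the partition.
edge_cut = sum(1 for u in adj for v in adj[u] if u < v and part[u] != part[v])

# Hypergraph model (column-net): one net per vertex, spanning the vertex
# and its neighbours; cost = sum over nets of (lambda - 1), where lambda
# is the number of parts the net touches.
hyperedge_cost = 0
for v in adj:
    touched = {part[v]} | {part[u] for u in adj[v]}
    hyperedge_cost += len(touched) - 1

# Actual communication volume: each vertex sends its value once to every
# remote part that owns at least one of its neighbours.
volume = sum(len({part[u] for u in adj[v]} - {part[v]}) for v in adj)

print(edge_cut, hyperedge_cost, volume)  # 3 4 4
```

Here the hyperedge cost (4) matches the true communication volume exactly, while the edge cut (3) only approximates it; on larger meshes the discrepancy can grow, which is precisely the motivation stated above.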
Another connected problem is that of resource allocation. This is
particularly important for the global coupling efficiency and
scalability, because each code involved in the coupling can be more or
less computationally intensive, and a good trade-off must be found
between the resources assigned to the codes in order to avoid idle
time. Typically, given a fixed number of processors and two coupled
codes, how should the processors be split between the two codes?
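Under the simplifying assumption of ideal strong scaling within each code, this trade-off can be sketched as a small minimax problem (the per-step workloads below are hypothetical, purely for illustration):

```python
# Hypothetical per-step workloads: code 1 does 300 work units, code 2 does 100.
W = (300.0, 100.0)
P = 16  # total processors to split between the two coupled codes

def step_time(p1):
    # Assumes ideal strong scaling within each code; the codes run
    # concurrently, so a coupling step lasts as long as the slower code.
    return max(W[0] / p1, W[1] / (P - p1))

best = min(range(1, P), key=step_time)
print(best, P - best)  # 12 4
```

With these numbers the optimum assigns processors proportionally to the workloads (12 and 4), balancing the two codes at 25 time units per step; real couplings are harder because scaling is not ideal and the redistribution and interpolation costs discussed above must enter the objective.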

#### Steering and interacting with complex coupled simulations

Computational steering is an effort to make the typical simulation workflow (modelling, computing, analyzing) more efficient, by providing online visualization of and interactive steering over the ongoing computational processes. Online visualization proves very useful for monitoring and detecting possible errors in long-running applications, while interactive steering allows the researcher to alter simulation parameters on the fly and to receive immediate feedback on their effects. The scientist thus gains additional insight into the cause-and-effect relationships within the simulation.

In the `ScAlApplix` project, we studied this problem in the case where
both the simulation and the visualization can be parallel, what we
call M-by-N computational steering, and we developed a software
environment called `EPSN` (see Section 5.3). More recently, we have
proposed a model for the steering of complex coupled simulations. An
important conclusion of this previous work is that the steering
problem can conveniently be modelled as a coupling problem between one
or more parallel simulation codes and one visualization code, which
can be parallel as well. In `HiePACS`, we propose to revisit the
steering problem as a coupling problem, and we expect to reuse the new
redistribution algorithms developed in the context of code coupling
for the purpose of M-by-N steering.
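The core of such an M-by-N redistribution can be sketched for the simplest case, a 1-D block-distributed array moving from M simulation processes to N visualization processes: each overlap between a sender's block and a receiver's block becomes one message. This is an illustrative sketch, not the algorithm the project will develop:

```python
def block(rank, nprocs, n):
    # Half-open index range [lo, hi) owned by `rank` in a block distribution
    # of n items over nprocs processes (remainder spread over the first ranks).
    base, rem = divmod(n, nprocs)
    lo = rank * base + min(rank, rem)
    return lo, lo + base + (1 if rank < rem else 0)

def redistribution_plan(n, m_senders, n_receivers):
    # Enumerate (sender, receiver, count) messages from the block overlaps.
    plan = []
    for s in range(m_senders):
        slo, shi = block(s, m_senders, n)
        for r in range(n_receivers):
            rlo, rhi = block(r, n_receivers, n)
            lo, hi = max(slo, rlo), min(shi, rhi)
            if lo < hi:
                plan.append((s, r, hi - lo))
    return plan

# 12 elements moving from 3 simulation processes to 4 visualization processes.
print(redistribution_plan(n=12, m_senders=3, n_receivers=4))
```

The counts in the plan sum to the array size, and minimizing such message volumes (and overlapping them with computation) is exactly where the hypergraph-based cost models above are expected to help.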

In several applications, it is often very useful either to visualize the results of the ongoing simulation before writing them to disk, or to steer the simulation by modifying some parameters and visualizing the impact of these modifications interactively. Nowadays, high performance computing simulations use many computing nodes that perform I/O using the widely used HDF5 file format. One problem is to provide real-time visualization on top of high performance computing; in that respect we need to efficiently combine very large parallel simulation systems with parallel visualization systems. The originality of our approach is the use of the HDF5 file format to write into a distributed shared memory (DSM), so that the data can be read by the upper part of the visualization pipeline. This leads us to define a relevant steering model based on a DSM; it implies finding ways to write and read data efficiently in this DSM, and to steer the simulation. This work is developed in collaboration with the Swiss National Supercomputing Centre (CSCS).

Regarding the interaction aspect, we are interested in providing new mechanisms to interact with the simulation directly through the visualization. For instance, in the ANR NOSSI, in order to speed up the computation, we are interested in rotating a molecule in a cavity or in moving it from one cavity to another within the crystal lattice. To perform such interactions safely, a model of the interaction in our steering framework is necessary to keep the data coherent in the simulation. Another point we plan to study is the monitoring of and interaction with resources, in order to perform user-directed checkpoint/restart or user-directed load balancing at runtime.
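The coherency requirement of such a DSM-based steering model can be sketched in a few lines of Python. The class and its API below are entirely hypothetical (they stand in for the actual HDF5/DSM machinery developed with CSCS): the simulation stages datasets for the current step and commits them atomically, so the visualization only ever reads a complete step, and user steering parameters travel back through the same shared object:

```python
class SteeringDSM:
    """Toy stand-in for a DSM-based steering buffer (hypothetical API)."""

    def __init__(self):
        self._buf = {}      # datasets of the last *published* (complete) step
        self._pending = {}  # datasets of the step currently being written
        self.step = -1      # id of the last complete step
        self.params = {}    # steering parameters pushed back by the user

    def write(self, name, data):
        # The simulation stages a dataset; readers cannot see it yet.
        self._pending[name] = data

    def publish(self, step):
        # Commit point: atomically swap in the finished step, preserving
        # coherency — readers never observe a half-written step.
        self._buf, self._pending = self._pending, {}
        self.step = step

    def read(self, name):
        # The visualization reads only from the last complete step.
        return self._buf[name]

    def steer(self, **params):
        # User interaction through the visualization flows back this way.
        self.params.update(params)

dsm = SteeringDSM()
dsm.write("positions", [0.0, 1.0, 2.0])
dsm.publish(step=0)
dsm.steer(timestep=0.5)
print(dsm.read("positions"), dsm.step, dsm.params)
```

The single commit point is the design choice worth noting: it is what makes interactions such as moving a molecule safe, since the simulation and the visualization never disagree on which step is current.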