## Section: Scientific Foundations

### General Framework for Validation

#### Low level modeling of communications

In the context of large scale dynamic platforms, it is unrealistic to determine precisely, at the application level, the actual topology of the underlying network and its contention. Indeed, existing tools such as Alnem [121] rely on a quasi-exhaustive determination of interferences, and it takes several days to determine the actual topology of a platform made up of a few tens of nodes. Given the dynamism of the platforms we target, we need to rely on less sophisticated models, whose parameters can be evaluated at runtime.

Therefore, we propose to model each node by an incoming and an outgoing bandwidth and to neglect interference that appears at the heart of the network (Internet), in order to concentrate on local constraints. We are currently implementing a script, based on Iperf, to determine the achieved bit-rates for one-to-one, one-to-many and many-to-one transfers, given the number of TCP connections and the maximal size of the TCP windows. The next step will be to build a communication protocol that enforces a prescribed sharing of the network resources. In particular, if in the optimal solution a node P_{0} must send data at rate x_{i}^{out} to node P_{i} and receive data at rate y_{j}^{in} from node P_{j}, the goal is to achieve the prescribed bit-rates, provided that all capacity constraints are satisfied at each node. Our aim is to implement, using Java RMI, a protocol able both to evaluate the parameters of our model (incoming and outgoing bandwidths) and to ensure a prescribed sharing of communication resources.
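The capacity constraints of this model can be illustrated with a minimal sketch. It assumes that a node is described only by an outgoing and an incoming bandwidth, and that a prescribed allocation is a map from (sender, receiver) pairs to rates. The names `Node` and `is_feasible`, the units and the example platform are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A node described only by its local capacities (assumed unit: Mbit/s)."""
    name: str
    out_bw: float  # outgoing bandwidth
    in_bw: float   # incoming bandwidth


def is_feasible(nodes, rates, eps=1e-9):
    """Check that prescribed rates respect every node's capacities.

    nodes: dict mapping node name -> Node
    rates: dict mapping (sender, receiver) -> prescribed rate
    Interference inside the network core is ignored, as in the model above.
    """
    sent = {name: 0.0 for name in nodes}
    received = {name: 0.0 for name in nodes}
    for (src, dst), rate in rates.items():
        if rate < 0:
            return False
        sent[src] += rate
        received[dst] += rate
    return all(
        sent[n] <= nodes[n].out_bw + eps and received[n] <= nodes[n].in_bw + eps
        for n in nodes
    )


# Hypothetical three-node platform: P0 sends to P1 and receives from P2.
nodes = {
    "P0": Node("P0", out_bw=10.0, in_bw=5.0),
    "P1": Node("P1", out_bw=2.0, in_bw=8.0),
    "P2": Node("P2", out_bw=4.0, in_bw=4.0),
}
rates = {("P0", "P1"): 7.0, ("P2", "P0"): 4.0}
print(is_feasible(nodes, rates))
```

Such a check only states when a prescribed sharing is admissible; enforcing it at runtime is the role of the communication protocol described above.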

Under this communication model, it is possible to obtain pathological results. For instance, if we consider a master-slave setting (corresponding to the distribution of independent tasks on a Volunteer Computing platform such as BOINC), the number of slaves connected to the master may be unbounded. In practice, simultaneously opening a large number of TCP connections may lead to a poor sharing of communication resources. Therefore, we propose to add a bound on the number of connections that can be handled simultaneously by a given node. Estimating this bound is important for obtaining realistic communication models.
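The effect of such a bound can be sketched on the master-slave example. The function below is an illustration under simple assumptions: the master is limited only by its outgoing bandwidth, each slave only by its incoming bandwidth, and rates may be split arbitrarily. The name `master_throughput` and the numerical values are hypothetical.

```python
def master_throughput(master_out_bw, slave_in_bws, max_connections=None):
    """Upper bound on the aggregate rate from the master to its slaves.

    Each slave i can receive at most slave_in_bws[i]; the master can send at
    most master_out_bw in total and, if max_connections is given, can serve at
    most that many slaves at once. Returns (rate, number of slaves served).
    """
    # Serving the slaves with the largest incoming bandwidth first uses the
    # fewest connections for a given aggregate rate.
    candidates = sorted(slave_in_bws, reverse=True)
    if max_connections is not None:
        candidates = candidates[:max_connections]
    remaining = master_out_bw
    served = 0
    for bw in candidates:
        if remaining <= 0:
            break
        remaining -= min(bw, remaining)
        served += 1
    return master_out_bw - remaining, served


# Hypothetical platform: a fast master and 400 slow slaves.
slaves = [0.25] * 400
print(master_throughput(100.0, slaves))       # no bound on connections
print(master_throughput(100.0, slaves, 50))   # at most 50 connections
```

Without a bound, saturating the master in this example requires 400 simultaneous connections; with at most 50 connections the achievable rate drops, which is why the value chosen for the bound matters for the realism of the model.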

#### Simulation

Once a low-level model has been obtained, it is crucial to be able to test the proposed algorithms. To do this, we will first rely on simulation rather than direct experimentation. Indeed, in order to compare heuristics, it is necessary to execute them on the same platform. In particular, all changes in the topology or in the resource performance should occur at the same time during the execution of the different heuristics. Replicating the same scenario several times is only practical in simulation. Moreover, the metric we have tentatively defined for providing approximation results in the case of dynamic platforms requires computing the optimal solution at each time step, which can be done off-line if all traces for the different resources are stored. Using simulation rather than experiments can be justified if the simulator itself has been proved valid. In addition, a simulator such as SimGrid can model communications, processing and their interactions in much more detail than the model used to provide a theoretical approximation ratio. In particular, sophisticated TCP models for bandwidth sharing have been implemented in SimGrid.
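The idea of replaying one stored scenario for several heuristics can be sketched as follows. This is a toy illustration, not the metric used in the project: it assumes that a trace is a list of platform states, that each heuristic is a stateless function of the current state, and that the comparison is the ratio between cumulated heuristic throughput and cumulated per-step optimum. All names and values are hypothetical.

```python
def replay(trace, heuristic, optimum):
    """Run one heuristic on a stored trace and compare it with the per-step optimum.

    trace: list of platform states, one per time step (here a state is a list
           of available bandwidths; this representation is an assumption)
    heuristic, optimum: functions mapping a state to an achieved throughput
    Returns the ratio of cumulated heuristic throughput to cumulated optimum.
    """
    achieved = sum(heuristic(state) for state in trace)
    best = sum(optimum(state) for state in trace)
    return achieved / best if best > 0 else 1.0


# Hypothetical stored trace: available bandwidth of three resources over time.
trace = [[4, 2, 2], [1, 5, 2], [3, 3, 0]]


def optimum(state):
    # Off-line optimum for this toy setting: use every resource.
    return sum(state)


def first_only(state):
    # Toy heuristic: always use resource 0.
    return state[0]


def best_single(state):
    # Toy heuristic: use the best single resource at each step.
    return max(state)


# Both heuristics see exactly the same sequence of platform changes.
for h in (first_only, best_single):
    print(h.__name__, replay(trace, h, optimum))
```

Because the trace is stored, the per-step optimum can be computed off-line once and reused for every heuristic under comparison.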

At a higher level, the derivation of realistic models for large scale platforms is out of the scope of our project. Therefore, in order to obtain traces and models, we will collaborate with the MESCAL, GANG and ASAP projects. We have already worked on these topics with the members of GANG in the ACI Pair-A-Pair (which finished in 2006; the ANR Aladdin Programme Blanc acts as a follow-up, with the members of the GANG and Cepage projects). On the other hand, we also need to rely on an efficient simulator in order to test our algorithms. We have not yet chosen the discrete event simulator we will use. One attractive possibility would be to adapt SimGrid, developed in the MESCAL project, to large scale dynamic environments. Indeed, a parallel version of SimGrid, based on activations, is currently under development in the framework of the USS-SimGrid ANR Arpege project (with the MESCAL, ALGORILLE and ASAP teams). This version will be able to deal with platforms containing more than 10^{5} resources. SimGrid was developed by Henri Casanova (U.C. San Diego) and by Arnaud Legrand during his PhD (under the co-supervision of O. Beaumont).

#### Practical validation and scaling

Finally, we propose several applications that will be described in detail in Section 5. These applications cover a large set of fields (molecular dynamics, distributed storage, continuous integration, distributed databases, etc.). All these applications will be developed and tested with an academic or industrial partner. In all these collaborations, our goal is to prove that the services that we propose in Section 6.2.2 can be integrated as steering tools in already developed software. We aim to assess the practical interest of the services we develop and then to integrate and distribute them as a library for large scale computing.

In order to test our algorithms, we propose to implement these services using Java RMI. The main advantages of Java RMI in our context are its ease of use and its portability. Multithreading is also a crucial feature for scheduling concurrent communications, and it does not interfere with the ad-hoc routing protocols developed in the project.

A prototype has already been developed in the project as a steering tool for molecular dynamics simulations (see Section 5.1). All the applications will first be tested on small scale platforms (using desktop workstations in the laboratory). Then, in order to test their scalability, we propose to implement them either on the GRID 5000 platform or on the partner's platform.