Team AlGorille


Section: New Results

Experimental Validation

Participants : Malek Cherier, Xavier Delaruelle, Ahmed Harbaoui, Emmanuel Jeannot, Martin Quinson, Christophe Thiery.

Improvement of the SimGrid tool

The goal of the SimGrid tool suite is to enable the study and development of distributed applications on modern platforms. It is the result of a collaboration with Henri Casanova (Univ. of Hawaii, Manoa) and Arnaud Legrand (MESCAL team, INRIA Rhône-Alpes, France). Simulation is a common answer to Grid-specific challenges such as scale and heterogeneity. SimGrid is one of the major simulators in the Grid community.

This year, Malek Cherier was hired as an engineer on an ODL contract (Opération de Développement Logiciel, i.e. software development operation) granted by INRIA, allowing us to assess the technical issues posed by the development of the tool. We completed the port of SimGrid to the Windows platforms, widening the potential user base. We also worked on an automated testing infrastructure to ensure the quality of the software. This step was necessary to stabilize the existing code base before adding the new functionalities planned for the future.

At the same time, Christophe Thiery enhanced the simulator by adding a new interface called SimDag. It makes it easy to express applications modeled as DAGs (directed acyclic graphs) of tasks. Such an interface was present in SimGrid version 2, but it had not yet been ported to the SimGrid 3 framework. This functionality supports the work on parallel task scheduling carried out within the team (cf. 6.2.2 ) and beyond.
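To illustrate what modeling an application as a DAG of tasks means, here is a minimal Python sketch. It is not the SimDag interface: the function name, the task names and the durations are hypothetical, and it assumes that every task can start as soon as all its predecessors have finished (no resource contention).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_finish_times(durations, predecessors):
    """Return the earliest finish time of each task of a DAG.

    durations:    task name -> execution time (hypothetical values)
    predecessors: task name -> set of tasks that must finish first
    Assumption: unlimited resources, so a task starts as soon as
    all of its predecessors are done.
    """
    graph = {task: predecessors.get(task, set()) for task in durations}
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + durations[task]
    return finish


# Hypothetical four-task application: c waits for a and b, d waits for c.
durations = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}
predecessors = {"c": {"a", "b"}, "d": {"c"}}
finish = earliest_finish_times(durations, predecessors)
makespan = max(finish.values())
```

A scheduling heuristic would replace the "unlimited resources" assumption with a mapping of tasks onto processors; the dependency structure stays the same.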

SimGrid is freely downloadable [44] and its user base is growing rapidly, resulting in about ten publications (half of them by users outside the core team).

Grid Platform Discovery

Because the characteristics of Grids change over time, distributed applications targeting these platforms must be network-aware and react to changing conditions. To make this possible, applications need a synthetic view of the network conditions they experience. Several platform monitoring tools exist, but they provide irrelevant or incomplete information to network-aware applications. Most of these tools are intended to help network administrators detect abnormalities in their systems. They thus concentrate on very low-level metrics, such as the amount of data emitted by a given host, whereas network-aware applications need to know the available bandwidth between host pairs. Some tools were designed specifically to provide such higher-level information (the most prominent being NWS [51]), but they are limited to quantitative information about bandwidth, latency and processor availability.

We designed a tool to discover the network topology from the application's point of view. We are mainly interested in predicting the effect of resource sharing between concurrent data streams. This information is crucial, for example, to schedule the individual messages of group communications or to compute the optimal location of backup servers and storage areas.
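The following Python sketch shows why the topology matters when predicting the bandwidth obtained by concurrent streams. It assumes that links are shared according to max-min fairness, which is a common modeling choice and not something stated in this report; the link names, capacities and routes are hypothetical.

```python
def max_min_fair_rates(capacities, routes):
    """Max-min fair bandwidth allocation (assumed sharing model).

    capacities: link name -> capacity (e.g. in Mb/s)
    routes:     stream name -> list of links crossed by the stream
    Returns stream name -> allocated rate.
    """
    remaining = dict(capacities)
    active = {stream: set(links) for stream, links in routes.items()}
    if any(not links for links in active.values()):
        raise ValueError("every stream must cross at least one link")
    rates = {}
    while active:
        # Equal share each link could still give to its active streams.
        share = {}
        for link, capacity in remaining.items():
            users = sum(1 for links in active.values() if link in links)
            if users:
                share[link] = capacity / users
        # The most constrained link fixes the rate of the streams using it.
        bottleneck = min(share, key=share.get)
        rate = share[bottleneck]
        for stream in [s for s, links in active.items() if bottleneck in links]:
            rates[stream] = rate
            for link in active[stream]:
                remaining[link] -= rate
            del active[stream]
    return rates


# Hypothetical platform: stream B crosses a slow link that A avoids.
capacities = {"l1": 10.0, "l2": 4.0}
routes = {"A": ["l1"], "B": ["l1", "l2"]}
rates = max_min_fair_rates(capacities, routes)
```

In this example stream B is limited to 4.0 by link l2, which leaves 6.0 on l1 for stream A. A tool that only reported the end-to-end bandwidth of each stream measured in isolation could not predict this interaction.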

Testing and comparing the possible heuristics for this problem is difficult, since it comes down to assessing how similar the discovered graph is to the real platform topology. We designed a simulation-based testing framework that does so by comparing the performance of classical applications on the discovered platform and on the real one. This comparison metric thus captures how well the discovered platform matches the real one from the application's point of view.
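One possible way to turn this comparison into a number is sketched below in Python. The relative error of completion times is an assumption made for illustration, not necessarily the metric used in the framework; the application names and timings are hypothetical.

```python
def relative_errors(real_times, discovered_times):
    """Per-application relative error of the simulated completion time.

    real_times:       application -> time on the real platform
    discovered_times: application -> time on the discovered platform
    """
    return {
        app: abs(discovered_times[app] - real_times[app]) / real_times[app]
        for app in real_times
    }


def mean_relative_error(real_times, discovered_times):
    """Average the per-application errors into one score (0 = perfect match)."""
    errors = relative_errors(real_times, discovered_times)
    return sum(errors.values()) / len(errors)


# Hypothetical completion times (in seconds) of two classical applications.
real = {"broadcast": 10.0, "all-to-all": 20.0}
discovered = {"broadcast": 11.0, "all-to-all": 18.0}
score = mean_relative_error(real, discovered)
```

A lower score means that applications behave more alike on the two platforms, regardless of how different the two graphs look structurally.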

We compared several heuristics presented theoretically in the literature, and plan to improve them in the near future [43].

Grid'5000

Grid'5000 aims at building an experimental Grid platform featuring a total of five thousand CPUs over nine sites in France. We have built one of these sites by installing a 47-node HP cluster. Each compute node of the HP cluster has two 2 GHz AMD Opteron 246 processors with 2 GB of RAM and runs Debian Linux. The cluster is connected to the grid through a 10 Gigabit Ethernet network provided by Renater. We were the first site to provide this 10 Gigabit uplink. We manage the day-to-day usage of the cluster and regularly update it to follow the Grid'5000 recommendations as closely as possible.

We support the local and national Grid'5000 users by helping them use the platform. We provide training and work with them to find the best way for their experiments to use the grid. We took a significant part in the organization of the "Grid'5000 spring school 2006" and organized our own education day, called "Journée Grid'5000 au Loria".

Each Grid'5000 site aims at providing at least five hundred CPUs. We have specified our needs for a second cluster and are conducting its purchase. This new machine is a 120-node HP cluster. Each compute node has two 1.6 GHz dual-core Intel Xeon 5110 processors with 2 GB of RAM and two Gigabit Ethernet interfaces. We plan to receive and install this cluster at the end of the year.

We take a significant part in the national development of the Grid'5000 platform. We help consolidate the production infrastructure by developing tools such as the account management software called cpu-g5k (https://gforge.inria.fr/projects/grid5000/) and by creating Linux-based grid environments for users. We also participate in most of the existing workgroups of the project, such as the one for the next Kadeploy version (https://gforge.inria.fr/projects/kadeploy/). Moreover, we are in charge of some of the collaborative services, such as the wiki website and the bug reporting tool.

