Team grand-large

Members
Overall Objectives
Scientific Foundations
Application Domains
Software
New Results
Other Grants and Activities
Dissemination
Bibliography

Section: New Results

Nation Wide Experimental Platforms (testbed)

Grid'5000The Grid is envisioned to become a main infrastructure to provide seamless and transparent access to computing, storage, communications and service facilities to Internet users. After a first experimentation phase, generally with a low number of resources, new projects are unveiled with the objective to build large scale Grids combining hundreds of computers around the world for thousands of users. The European EGEE project is one example in Europe.

However, Grids are very complex objects because they are fundamentally distributed systems gathering complex and potentially volatile nodes, featuring a deep software stack and connected by possibility asynchronous (best effort) networks. These systems are so complex that it is not known if one can model their behaviour with enough accuracy to predict their properties (performance, fault tolerance, security, QoS) without realizing actual experimentations. Thus observations of real Grid, experimentations with real conditions, phenomena isolation and behaviours understanding are certainly important steps towards accurate models. In that perspective, experimental testbed are fundamental methodological tools, allowing experimentation and observation of large scale phenomena in Grid and their applications. Those aspects have been surveyed in [11] , [12] .

The Grid 5000 project aims at building and developing a nation wide highly reconfigurable experimental testbed allowing a large variety of experiments on all the different layers of the software stack between the users and the hardware. Grid 5000 seeks to ease and support experimentations and to provide rigorous control and measurement mechanisms.

In its current state, this instrument for Grid researchers is built gathering the resources of 8 computers centres (Grid 5000 sites), connected by RENATER (the French national network for research and education), offering to the users thousands of CPUs. Each sites host a PC clusters providing from 256 to 1000 nodes (CPUs). The Grid 5000 control and provisioning environment allows to configure and install a full software stack on each Grid 5000 cluster nodes. This will give the users the unique capability to setup the exact software environment required for his experiment. Thus, the user may specify the OS, network protocols, middleware, runtimes, application and more generally all components of the software stack needed for his experiment. In addition to this configuration capabilities, Grid 5000 will offer a set of tools controlling the experimental conditions during the execution of the experiment. Basically, the user will be able to start and stop every Grid 5000 nodes, on demand.

Grid 5000 is a multi-institutions project, gathering funding from the French Ministry of research, INRIA, CNRS, University and several regional councils. The direction of the project is ensured by a Steering Committee (SC) involving the director of the ACI Grid, Thierry Priol, The director of the ACI Grid scientific committee, Brigitte Plateau, the director of RENATER, Dany Vandromme and all leaders of Grid 5000 sites. The project is implemented by a team of engineers belonging to the technical committee (TC). More than hundred researchers (permanents and Ph. D. students) will use this instrument involving about 50 engineers (10 at full time).

Regarding the INRIA, Grid'5000 is a collaborative effort of several INRIA projects (by alphabetical order): Apache, Caiman, Grand-Large, Oasis, Paris, Remap, Reso, Runtime, Scalaplix. Thus several Research Units are involved (by alphabetical order): Futurs, Rennes, Rhone Alpes and Sophia. An overview of the Grid5000 project has been published in [20]

The role of Grand-Large in Grid 5000 is first the direction of the project, providing a vision, chairing the Steering Committee, preparing roadmaps and decisions to be discussed by the SC, preparing the SC meetings, etc. Second, Grand-Large, chairs the Technical Committee in charge of implementing the decisions of the SC and giving information to the SC to help the decision process. By being a central member in the SC and TC, Grand-Large plays a major role in Grid 5000.

One Expert Engineer funded by the INRIA is associated with this project at Orsay for its every day configuration and maintenance. Several Engineers from IDRIS and LRI have participated to the original design of Grid 5000 and help the Expert Engineer.

Grid eXplorerLarge scale distributed systems like P2P systems, Sensor Networks and Desktop Grids exhibit complex behaviours, difficult to understand because they fundamentally gather a large set of volatile nodes connected by an asynchronous network. Most of the well known techniques in distributed systems for fault tolerance do not work at this scale because their complexity is too high, they do not accept fault during some stabilisation phase or because the system is evolving in size too rapidly and too strongly. Like for the Grid, large scale distributed systems are complex to model and require prior experimentations and observations

To offer a respond to this challenge, the Grid eXplorer project aims at providing a large scale distributed system emulator. It consists first in building the emulator gathering hardware and software components and developing the unavailable software. In term of hardware, the project seeks to install a 1000 CPUs cluster using a non blocking Ethernet network as well as a non blocking high speed network for a subset of the cluster. As software, the emulation environment will allow users to configure all layers of the software stack for every experiment. In addition to this feature, shared with Grid 5000, Grid eXplorer will provide network emulators, virtualization mechanisms as well as fault injectors. The second part of the project is to address a variety of large scale distributed system issues by experimenting actual applications, distributed systems, OS and network protocols and testing new ones.

Grid eXplorer complements Grid 5000 by providing an experimental environment where the user has the capacity to control the network experimental conditions.

This project is supported by several funding sources: the ministry of research through the ACI Masses de données (Data Grid Explorer), INRIA, CNRS and the Ile de France regional council. The total budget of this project is about 2 Millions Euros.

Regarding the INRIA, Grid eXplorer is a collaborative effort of several INRIA projects (by alphabetical order): Apache, Grand-Large, Regal and Reso. Thus several Research Units are involved (by alphabetical order): Futurs, Rhone Alpes and Rocquencourt.

Grand-Large co-initiated this project, leads the ACI Masse de données project and managed the initial procurement as well as the hosting of the cluster, actually installed in the IDRIS laboratory at Orsay. One engineer is associated with this project at Orsay for its every day configuration and maintenance. This engineer is funded by the ACI Masse de données . Several Engineers from IDRIS and LRI have participated to the original design of Grid Explorer and help the Expert Engineer.


previous
next

Logo Inria