Project Team Graal



Section: Partnerships and Cooperations

National Initiatives

ANR White Project Rescue, 4 years, 2010-2014

Participants : Anne Benoit, Loris Marchal, Yves Robert, Frédéric Vivien, Dounia Zaidouni.

The ANR White Project Rescue was launched in November 2010, for a duration of 48 months. It gathers three INRIA partners (Graal, Grand-Large and Hiepacs) and is led by Graal. The main objective of the project is to develop new algorithmic techniques and software tools to solve the exascale resilience problem. Solving this problem implies a departure from current approaches, and calls for yet-to-be-discovered algorithms, protocols and software tools.

The proposed research follows three main thrusts. The first thrust deals with novel checkpoint protocols. The second thrust entails the development of novel execution models, i.e., accurate stochastic models to predict (and, in turn, optimize) the expected performance (execution time or throughput) of large-scale parallel scientific applications. In the third thrust, we will develop novel parallel algorithms for scientific numerical kernels.

ANR grant SPADES, 3 years, 08-ANR-SEGI-025, 2009-2012

Participants : Eddy Caron, Florent Chuffart, Frédéric Desprez, Haiwu He.

The emergence of Petascale architectures and the evolution of both research grids and computational grids greatly increase the number of potential resources. However, existing infrastructures and access rules do not make it possible to take full advantage of these resources. One key idea of the SPADES project is to propose a non-intrusive but highly dynamic environment able to take advantage of the available resources without disturbing their native use. In other words, the SPADES vision is to adapt the desktop grid paradigm by replacing users at the edge of the Internet with volatile resources. These volatile resources are in fact submitted via batch schedulers to reservation mechanisms that are limited in time or subject to preemption (best-effort mode).

One of the priorities of SPADES is to support platforms at a very large scale. Petascale environments are therefore given particular consideration. Nevertheless, the expertise needed to use these next-generation architectures accurately and effectively is still lacking. One of the goals of SPADES is to show how to take advantage of the power of such architectures. Another challenge of SPADES is to provide a software solution for a service discovery system able to cope with a highly dynamic platform. This system will be deployed over volatile nodes and thus must tolerate failures. SPADES will propose solutions for the management of distributed schedulers in Desktop Computing environments, coping with a co-scheduling framework.

ANR grant: COOP (Multi Level Cooperative Resource Management), 3 years, ANR-09-COSI-001-01, 2009-2012

Participants : Frédéric Desprez, Cristian Klein, Christian Pérez.

The main goals of this project are to set up a cooperation that is as general as possible with respect to programming models and resource management systems, and to develop algorithms for efficient resource selection. In particular, the project targets the SALOME platform and the GRID-TLSE expert site (http://gridtlse.org/) as examples of programming models, and Marcel/PadicoTM, DIET and XtreemOS as examples of a multithread scheduler/communication manager, a grid middleware and a distributed operating system, respectively.

The project is led by Christian Pérez.

ANR JCJC: Clouds@Home (Cloud Computing over Unreliable, Shared Resources), 4 years, ANR-09-JCJC-0056-01, 2009-2012

Participants : Gilles Fedak, Bing Tang.

Recently, a new vision of cloud computing has emerged where the complexity of an IT infrastructure is completely hidden from its users. At the same time, cloud computing platforms provide massive scalability, 99.999% reliability, and speedy performance at relatively low costs for complex applications and services. This project, led by D. Kondo from INRIA MESCAL, investigates the use of cloud computing for large-scale and demanding applications and services over unreliable resources. In particular, we target volunteered resources distributed over the Internet. In this project, G. Fedak leads the data management task (WP3).

ANR ARPEGE MapReduce (Scalable data management for Map-Reduce-based data-intensive applications on cloud and hybrid infrastructures), 4 years, ANR-09-JCJC-0056-01, 2010-2013

Participants : Julien Bigot, Frédéric Desprez, Gilles Fedak, Sylvain Gault, Christian Pérez, Anthony Simonet.

MapReduce is a parallel programming paradigm successfully used by large Internet service providers to perform computations on massive amounts of data. After being strongly promoted by Google, it has also been implemented by the open source community through the Hadoop project, maintained by the Apache Foundation and supported by Yahoo! and even by Google itself. This model is currently getting more and more popular as a solution for rapid implementation of distributed data-intensive applications. The key strength of the Map-Reduce model is its inherently high degree of potential parallelism.
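As an illustration of the paradigm described above, here is a minimal single-machine sketch of a word count written in the MapReduce style. It is not taken from the project or from Hadoop: the function names (map_phase, shuffle, reduce_phase, word_count) and the use of a thread pool are assumptions made for this example only. It shows why the model parallelizes well: each input chunk is mapped independently, and each key is reduced independently.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(item):
    """Reduce step: combine the values collected for a single key."""
    key, values = item
    return key, sum(values)


def word_count(chunks, workers=4):
    """Hypothetical driver: run map and reduce tasks on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Chunks are independent, so map tasks can run concurrently.
        mapped = list(pool.map(map_phase, chunks))
        groups = shuffle(mapped)
        # Keys are independent, so reduce tasks can run concurrently too.
        reduced = list(pool.map(reduce_phase, groups.items()))
    return dict(reduced)


if __name__ == "__main__":
    print(word_count(["map reduce map", "reduce shuffle"]))
```

A real deployment would distribute the chunks over many machines and handle failures and data placement, which is where the scheduling and dependability issues studied in the project arise.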

In this project, the GRAAL team participates in several work packages that address key issues such as the efficient scheduling of several MapReduce applications, integration using components on large infrastructures, security and dependability, and MapReduce for Desktop Grids.

ADT MUMPS, 3 years, 2009-2012

Participants : Maurice Brémond, Guillaume Joslin, Jean-Yves L'Excellent.

ADT-MUMPS is an action of technological development funded by Inria. Tools for experimentation, validation, and performance study of MUMPS are being developed; another goal is to make efficient use of pipol, Inria's common porting, testing and compilation cluster.

ADT ALADDIN

Participants : Frédéric Desprez, Matthieu Imbert, Christian Pérez.

ALADDIN is an Inria action of technological development for “A LArge-scale DIstributed and Deployable INfrastructure”, whose aim is to manage the Grid'5000 experimental platform. Frédéric Desprez leads this project, with David Margery from Rennes as Technical Director.

ADT BitDew, 2 years, 2010-2012

Participants : Gilles Fedak, José Saray.

ADT BitDew is an INRIA support action of technological development for the BitDew middleware. It has several objectives: i/ provide documentation and educational material for end-users, ii/ improve software quality and support, iii/ develop new features for managing Cloud and Grid resources. The ADT BitDew, led by G. Fedak, funds the recruitment of a young engineer for 24 months.

HEMERA Large Wingspan Inria Project, 2010-2013

Participants : Daniel Balouek, Christian Pérez, Frédéric Vivien.

Hemera deals with the scientific animation of the Grid'5000 community. It aims at making progress in the understanding and management of large-scale infrastructures by leveraging expertise distributed across various French teams. Hemera comprises several scientific challenges and working groups, and involves more than 20 teams located in 9 cities of France.

C. Pérez is leading the project and D. Balouek is managing scientific challenges on Grid'5000.

Action Interfaces Recherche en grille – Grilles de production. Institut des Grilles du CNRS – Action Aladdin INRIA

Participant : Yves Caniou.

This action addresses economic issues concerning greenness in scientific and production grids. The issues addressed include the confrontation of the energy models in place in experimental grids with the operational realities of production grids, the study of new energy prediction models based on real measurements of energy consumption in production grids, and the design of energy-aware scheduling heuristics.

FastExpand: Regional Grant

Participant : Eddy Caron.

The FastExpand start-up asked to benefit from the knowledge of the GRAAL research team on distributed systems and middleware. The aim of this company is to create new-generation games using a new distributed architecture. E. Caron and F. Desprez participate in this action. In 2011, a distributed prototype for handling bursts of requests from MMORPGs (Massively Multiplayer Online Role-Playing Games) was successfully designed, and it reached the required performance.