Team GRAAL

Section: Software

DIET

Participants: Nicolas Bard, Yves Caniou, Eddy Caron [correspondent], Ghislain Charrier, Frédéric Desprez, Adrian Muresan, Vincent Pichon.

Grid middleware systems now make it possible to process very large problems over the Internet. Scientists from other disciplines need to use off-the-shelf applications, yet the computational power and memory these applications require cannot be met by every workstation. The RPC paradigm is therefore a good candidate for building Problem Solving Environments on the Grid. The aim of the Diet project (http://graal.ens-lyon.fr/DIET) is to develop a set of tools to build computational servers accessible through a GridRPC API.

Moreover, the aim of a middleware system such as Diet is to provide transparent access to a pool of computational servers. Diet focuses on offering such a service at a very large scale. A client that has a problem to solve should be able to obtain a reference to the server best suited to it. Diet is designed to take data location into account when scheduling jobs. Data are kept as long as possible on (or near) the computational servers in order to minimize transfer times. This kind of optimization is mandatory when scheduling jobs over a wide-area network. Diet is built upon Server Daemons. The scheduler is scattered across a hierarchy of Local Agents and Master Agents.
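The effect of data location on server selection can be illustrated with a minimal Python sketch. It is not Diet code: the `Server` class, the cost model (compute time plus transfer time for missing inputs) and all field names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical description of a computational server."""
    name: str
    flops: float       # sustained compute speed, operations per second
    bandwidth: float   # incoming transfer rate, bytes per second
    cached: set = field(default_factory=set)  # ids of data already stored locally


def estimated_time(server, work, inputs):
    """Estimate completion time of a job on `server`.

    `work` is the number of operations; `inputs` maps data id -> size in bytes.
    Only inputs that are not already on the server have to be transferred.
    """
    missing = sum(size for data_id, size in inputs.items()
                  if data_id not in server.cached)
    return work / server.flops + missing / server.bandwidth


def best_server(servers, work, inputs):
    """Return the server with the smallest estimated completion time."""
    return min(servers, key=lambda s: estimated_time(s, work, inputs))
```

Under this assumed model, a slower server that already holds a large input can win over a faster one that would first have to download it, which is the reason for keeping data on or near the servers.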

Applications targeted for the Diet platform are now able to exert a degree of control over the scheduling subsystem via plug-in schedulers [88]. As the applications to be deployed on the Grid vary greatly in their performance demands, the Diet plug-in scheduler facility lets the application designer express application needs and features so that they are taken into account when application tasks are scheduled. These features are invoked at runtime after a user has submitted a service request to the Master Agent (MA), which broadcasts the request to its agent hierarchy.
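As a rough illustration of the idea, the Python sketch below forwards a request down a small agent tree and ranks the servers with an application-supplied priority function. The dictionary layout, the `collect` function and the estimate fields (`free_mem`, `load`) are hypothetical and do not reproduce the Diet plug-in scheduler API; sorting the candidates at every agent is also an assumption.

```python
def collect(node, request, priority):
    """Return (server_name, estimate) pairs for `request`, best candidate first.

    `node` is either a server (a dict with an "estimate" callable) or an agent
    (a dict with a "children" list). `priority` is supplied by the application:
    it maps an estimate to a sort key, and lower keys are preferred.
    """
    if "children" not in node:
        # Leaf: a server computes its own estimate for this request.
        return [(node["name"], node["estimate"](request))]
    candidates = []
    for child in node["children"]:
        # Agent: forward the request to every child and gather the answers.
        candidates.extend(collect(child, request, priority))
    # Each agent orders what it received with the application's criterion.
    return sorted(candidates, key=lambda c: priority(c[1]))


# Hypothetical platform: one MA, two LAs, three servers with fixed estimates.
s1 = {"name": "s1", "estimate": lambda req: {"free_mem": 4, "load": 0.9}}
s2 = {"name": "s2", "estimate": lambda req: {"free_mem": 2, "load": 0.1}}
s3 = {"name": "s3", "estimate": lambda req: {"free_mem": 8, "load": 0.5}}
ma = {"name": "MA", "children": [
    {"name": "LA1", "children": [s1, s2]},
    {"name": "LA2", "children": [s3]},
]}

# A memory-hungry application prefers the server with the most free memory.
by_memory = collect(ma, {"service": "solve"}, priority=lambda e: -e["free_mem"])
# A latency-sensitive application prefers the least loaded server.
by_load = collect(ma, {"service": "solve"}, priority=lambda e: e["load"])
```

The same platform returns a different ranking for each application, which is the degree of control the plug-in facility is meant to give.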

Diet has been validated on several applications. Some of them are described in Sections 4.2 through 4.7.

Workflow support

Workflow-based applications are scientific, data-intensive applications that consist of a set of tasks to be executed in a certain partial order. These applications are an important class of Grid applications and are used in various scientific domains such as astronomy or bioinformatics. We have developed a workflow engine in Diet to manage such applications and to offer end users and developers a simple way either to use the provided scheduling algorithms or to develop their own.

In our implementation, workflows are described in XML. Since no standard exists for scientific workflows, we have proposed our own formalism. The Diet agent hierarchy has been extended with a new special agent, the MA_DAG. For flexibility, workflows can be executed even if this special agent is not present in the platform. Using the MA_DAG centralizes the scheduling decisions and can thus provide better scheduling when the platform is shared by multiple clients. On the other hand, if the client bypasses the MA_DAG, a new scheduling algorithm can be used without affecting the Diet platform. The current implementation of Diet provides several schedulers (Round Robin, HEFT, random, Fairness on finish Time, etc.).
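To make the notion of scheduling a workflow concrete, here is a minimal Python sketch of a round-robin policy over a task graph. It is only an illustration under stated assumptions: the workflow is given as a plain dictionary rather than in the Diet XML formalism, and `round_robin_schedule` is a hypothetical name, not a Diet function.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import cycle


def round_robin_schedule(dependencies, servers):
    """Assign workflow tasks to servers in turn, respecting the partial order.

    `dependencies` maps each task to the set of tasks it depends on.
    Returns a list of (task, server) pairs in a valid execution order.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()                      # raises CycleError if the graph has a cycle
    next_server = cycle(servers)
    schedule = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose predecessors are all done
        for task in ready:
            schedule.append((task, next(next_server)))
        sorter.done(*ready)                 # pretend the whole wave has completed
    return schedule


# Diamond-shaped workflow: a feeds b and c, which both feed d.
diamond = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
plan = round_robin_schedule(diamond, ["s1", "s2"])
```

Round robin ignores task costs and server speeds; a heuristic such as HEFT would replace the `next(next_server)` choice with one based on estimated finish times.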

The Diet workflow runtime also includes a rescheduling mechanism. Most workflow scheduling algorithms are based on performance predictions that are not always accurate (because of an erroneous prediction tool or a wrongly estimated resource load). The rescheduling mechanism can trigger the rescheduling of the application when conditions specified by the client are met.
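One possible client-specified condition is a tolerance on prediction error. The Python sketch below is an assumption about what such a condition could look like; the function name and the relative-tolerance rule are illustrative and are not taken from Diet.

```python
def needs_rescheduling(predicted_duration, observed_duration, tolerance):
    """Return True when a task ran longer than predicted by more than `tolerance`.

    `tolerance` is relative: 0.2 means the client accepts up to 20% overrun
    before asking for the remaining tasks to be rescheduled.
    """
    return observed_duration > predicted_duration * (1 + tolerance)


# A task predicted at 100 s that took 130 s exceeds a 20% tolerance;
# one that took 110 s does not.
late = needs_rescheduling(100, 130, 0.2)
on_time = needs_rescheduling(100, 110, 0.2)
```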

We also continued our work on schedulers for the Diet workflow engine for applications based on multiple workflows, and on graphical tools for workflows within the Diet DashBoard project. Within the Gwendia project, we worked on the implementation of the language defined in the project and on the Cardiac application. Experiments were carried out on the Grid'5000 platform.

Diet Data Management

DAGDA, designed during the PhD of Gaël Le Mahec, is a new data manager for the Diet middleware that provides explicit or implicit data replication and advanced data management on the grid. It was designed to be backward compatible with previously developed Diet applications, which benefit transparently from data replication. In addition to replication, it provides file sharing between nodes that can access the same disk partition, the choice of a data replacement algorithm, and high-level configuration of the memory and disk space Diet should use for data storage and transfers. To transfer data, DAGDA uses the pull model instead of the push model used by DTM: the data are not sent within the profile from the source to the destination, but are downloaded by the destination from the source. DAGDA also chooses the best source for a given piece of data. DAGDA has also been used to validate our joint replication and scheduling algorithms over Diet.
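The pull model, source selection and data replacement can be sketched together in a few lines of Python. This is not the DAGDA implementation: the `Node` class, the LRU replacement policy, the item-count capacity and the `cost` callback are all assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class Node:
    """Hypothetical storage node with a bounded local store and LRU replacement."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity      # maximum number of data items kept locally
        self.store = OrderedDict()    # data id -> value, least recently used first

    def put(self, data_id, value):
        """Store a data item, evicting the least recently used ones if needed."""
        self.store[data_id] = value
        self.store.move_to_end(data_id)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)

    def pull(self, data_id, sources, cost):
        """Fetch `data_id` into this node and return the node it was read from.

        The destination drives the transfer (pull model): it picks, among the
        sources holding a replica, the one with the lowest `cost(source, self)`.
        Raises ValueError if no source holds the data.
        """
        if data_id in self.store:
            self.store.move_to_end(data_id)   # local hit, no transfer needed
            return self
        holders = [s for s in sources if data_id in s.store]
        source = min(holders, key=lambda s: cost(s, self))
        self.put(data_id, source.store[data_id])  # this node now holds a replica
        return source
```

After a pull, the destination itself becomes a replica holder, so later requests from nearby nodes may be served without contacting the original source.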

GridRPC Data Management API

The GridRPC paradigm is now an OGF standard. The GridRPC community is interested in Data Management within the GridRPC paradigm. Because of previous work on Data Management in the Diet middleware, Eddy Caron is now co-chair of the GridRPC working group, leading the effort to propose a powerful Grid Data Management API that will extend the GridRPC paradigm.

Data Management is a challenging issue within the OGF GridRPC standard, for feasibility and performance reasons. Indeed, some temporary data do not need to be transferred once computed and can, for example, reside on servers. We can also imagine data being transferred directly from one server to another, without passing through the client as the standard GridRPC behavior would require.

Consequently, we are working on a Data Management API, which has been presented at almost all OGF sessions since OGF'21. Since December 2009, the proposal has been available for public comment at http://www.ogf.org/gf/docs/?public_comment under the title “Proposal for a Data Management API within the GridRPC. Y. Caniou and others, via GRIDRPC-WG”. The public comment period is now closed and all remarks have been included in the current document. This document is finished and will be standardized in 2011.

Middleware Interoperability

To meet the requirements of the GridTLSE project, Diet has been extended with a specialized version of a server daemon, which provides access to the AEGIS middleware services developed at JAEA. A demo was presented at the JAEA booth at SuperComputing'10.

Diet as a Cloud System

Cloud computing is drawing more and more attention, for multiple reasons, the most important of which are on-demand resource provisioning and pay-as-you-go pricing. In order to study and take advantage of these features, we extended the Diet middleware with Cloud support. Diet Cloud is a Diet module capable of harnessing the extensibility of Cloud platforms in a seamless manner. We have targeted the Eucalyptus open-source Cloud because it implements the Amazon EC2 Cloud interface. Recently, we also confirmed Diet Cloud's compatibility with Amazon EC2 by building a proof-of-concept demo, which was shown at SuperComputing'10.

Diet Green

We have designed a new metric called GreenPerf that allows Diet to provide a scheduler that takes energy information into account. We designed a heuristic to find the server with the best ratio between performance and power consumption. In collaboration with Laurent Lefevre from the RESO research team, we designed the architecture to deal with energy sensors. More developments and experiments are required to validate the integration into the current release.
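The exact definition of GreenPerf is not given in this section. As a stand-in, the Python sketch below ranks servers by performance per watt; the function names and the `flops` and `watts` fields are assumptions made for illustration and should not be read as the actual metric or heuristic.

```python
def perf_per_watt(flops, watts):
    """Assumed energy-efficiency score: operations per second per watt."""
    return flops / watts


def greenest_server(servers):
    """Return the server with the highest performance-per-watt score.

    `servers` is a list of dicts with hypothetical keys "name", "flops"
    (measured performance) and "watts" (power reported by an energy sensor).
    """
    return max(servers, key=lambda s: perf_per_watt(s["flops"], s["watts"]))


# The second server is slower but draws half the power, so it scores higher.
candidates = [
    {"name": "big", "flops": 100.0, "watts": 200.0},    # score 0.5
    {"name": "small", "flops": 80.0, "watts": 100.0},   # score 0.8
]
choice = greenest_server(candidates)
```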

MapReduce over Diet

The MapReduce programming model (re-)introduced by Google is a promising model for deploying data processing application services over large-scale platforms such as Grids and Clouds. We developed a version of MapReduce over Diet. In particular, we automated the creation of MapReduce-type workflows. Some large-scale experiments were conducted on the Grid'5000 platform to validate the concepts and algorithms developed.

For each input key/value pair, the Diet workflow engine generates one map task. Each map task computes intermediate key/value pairs and returns a container holding all of them. The Diet workflow engine merges the results and sends all containers to the sort service. This service sorts the pairs by combining each key and all its values in a container, which is itself added to a main container returned by the service. The Diet workflow engine then splits the main container and creates one reduce task for each element. Reduce tasks compute and return final key/value pairs. All final pairs are then merged and returned to the client.
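The data flow described above can be mimicked sequentially in Python. This sketch only mirrors the steps (map, merge, sort, split, reduce, merge); it does not use the Diet workflow engine, and `run_mapreduce` and the word-count functions are hypothetical names used for illustration.

```python
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Sequential imitation of the map / sort / reduce workflow.

    `inputs` is a list of (key, value) pairs. `map_fn(key, value)` and
    `reduce_fn(key, values)` each return a list of (key, value) pairs.
    """
    # One map task per input pair; each returns a container of intermediate pairs.
    containers = [map_fn(key, value) for key, value in inputs]
    # Merge all containers before handing them to the sort step.
    merged = [pair for container in containers for pair in container]
    # Sort step: one inner container per key, holding the key and all its values.
    groups = defaultdict(list)
    for key, value in merged:
        groups[key].append(value)
    main_container = [(key, groups[key]) for key in sorted(groups)]
    # Split the main container: one reduce task per element.
    reduced = [reduce_fn(key, values) for key, values in main_container]
    # Merge all final pairs into the result returned to the client.
    return [pair for pairs in reduced for pair in pairs]


def count_words_map(doc_id, text):
    """Emit (word, 1) for every word of a document."""
    return [(word, 1) for word in text.split()]


def count_words_reduce(word, counts):
    """Sum the occurrences of one word."""
    return [(word, sum(counts))]


result = run_mapreduce([("d1", "a b a"), ("d2", "b c")],
                       count_words_map, count_words_reduce)
```

In the real setting the map and reduce calls would be independent tasks that the workflow engine can run on different servers; only the ordering constraints between the steps are kept here.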

We implemented a prototype with the sorting service and a prototype with tree reduction. These prototypes allowed us to validate the feasibility of both solutions and to identify the constraints imposed by the Diet middleware.
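The text does not detail how the tree reduction prototype works. A generic pairwise tree reduction, which may differ from what was implemented over Diet, can be sketched in Python as follows; `tree_reduce` is a hypothetical name and `combine` is assumed to be associative.

```python
from operator import add


def tree_reduce(values, combine):
    """Reduce `values` by combining neighbours pairwise, level by level.

    With an associative `combine`, the result equals a left-to-right reduction,
    but the combinations within one level are independent of each other and
    could run as parallel tasks. Raises ValueError on an empty input.
    """
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    while len(level) > 1:
        # Combine elements two by two to build the next level of the tree.
        next_level = [combine(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])  # odd element is carried up unchanged
        level = next_level
    return level[0]


# Levels: [1, 2, 3, 4, 5] -> [3, 7, 5] -> [10, 5] -> [15]
total = tree_reduce([1, 2, 3, 4, 5], add)
```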

DIET and EDF R&D

We worked on the integration of Diet into the EDF infrastructure in the context of the partnership between INRIA Grenoble-Rhône-Alpes GRAAL and EDF R&D SINETICS OSIS. The first task was to provide a set of new functionalities allowing users to submit a large number of tasks to different remote LRMS (Local Resource Management Systems), to manage and tune these tasks, and finally to retrieve the results. The solution is based on Diet modules that can be called from C/C++ or directly from the command line.

Latest Releases

Since May, specific developments for the file and batch management of an HPC infrastructure for EDF R&D have been available as open source.

