Section: New Results
Large-scale data management for grids
Using transparent sharing of distributed data for databases
Since 2003, we have been working on the concept of a data-sharing service for grid computing, which we defined as a compromise between two rather different kinds of data-sharing systems:
DSM systems, which propose consistency models and protocols for the efficient, transparent management of mutable data on static, small-scale configurations (tens of nodes);
P2P systems, which have proven adequate for the management of immutable data on highly dynamic, large-scale configurations (millions of nodes).
We illustrated this concept through the JuxMem software platform, mainly developed by our group within the framework of Mathieu Jan's and Sébastien Monnet's PhD theses. JuxMem relies on the JXTA generic peer-to-peer framework, which provides basic building blocks for user-defined peer-to-peer services. L. Cudennec's PhD thesis is specifically devoted to improving the deployment of JXTA-based programs on large-scale grid platforms such as Grid'5000.
In 2007, we explored the possibility of building a distributed database management system (DBMS) on top of JuxMem, as a natural extension of previous approaches based on the distributed shared memory paradigm. The proposed approach consisted in providing the DBMS with transparent, persistent and fault-tolerant access to the stored data within an unstable, volatile and dynamic environment. The DBMS is thus relieved of any concern regarding the dynamic behavior of the underlying nodes. This work was carried out within the framework of the RESPIRE ANR project.
In 2008, we continued in this direction by exploring ways of integrating the concept of a data-sharing service into an existing DBMS. During Marius Moldovan's Master's internship, we experimented with interconnecting the BlobSeer prototype (mainly developed by Bogdan Nicolae) and the BerkeleyDB DBMS. In our prototype setting, BlobSeer serves as a block device for BerkeleyDB.
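The block-device layering used in this setting can be sketched as follows. This is a minimal, self-contained model in which an in-memory dictionary stands in for the distributed blob store; the class and method names (`read`, `write`, `read_block`, `write_block`) are illustrative assumptions, not the actual BlobSeer API:

```python
BLOCK_SIZE = 4096  # bytes per DBMS page (illustrative value)

class InMemoryBlobStore:
    """Stand-in for a distributed blob store (illustration only)."""
    def __init__(self):
        self.blobs = {}

    def write(self, blob_id, offset, data):
        blob = bytearray(self.blobs.get(blob_id, b""))
        if len(blob) < offset + len(data):
            # Grow the blob, zero-filling any gap before the written range.
            blob.extend(b"\0" * (offset + len(data) - len(blob)))
        blob[offset:offset + len(data)] = data
        self.blobs[blob_id] = bytes(blob)

    def read(self, blob_id, offset, size):
        return self.blobs.get(blob_id, b"")[offset:offset + size]

class BlobBlockDevice:
    """Expose one large blob as an array of fixed-size blocks: the view
    a DBMS storage layer expects from a block device. Each block access
    maps to a fine-grain read/write of a segment of the blob."""
    def __init__(self, store, blob_id, block_size=BLOCK_SIZE):
        self.store = store
        self.blob_id = blob_id
        self.block_size = block_size

    def read_block(self, n):
        # Only the n-th block is fetched, not the whole blob.
        return self.store.read(self.blob_id, n * self.block_size,
                               self.block_size)

    def write_block(self, n, data):
        assert len(data) == self.block_size
        self.store.write(self.blob_id, n * self.block_size, data)
```

The point of the sketch is the mapping itself: the DBMS sees ordinary fixed-size pages, while each page access translates into a fine-grain access to one segment of a single large blob.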
This work continues within the framework of B. Nicolae's PhD thesis, started in September 2007, with a focus on efficient storage of and access to large data chunks.
Hierarchical Grid Storage based on the JuxMem Grid Data-Sharing Service and on the Gfarm Global File System
Within the framework of our collaboration with the Gfarm team from the University of Tsukuba (Japan), we have defined a hybrid architecture which relies on both the JuxMem grid data-sharing service and the Gfarm grid file system ( http://datafarm.apgrid.org/ ) and combines their specific benefits. The main idea was to allow applications to use JuxMem's efficient memory-oriented API, while letting JuxMem persistently store data in disk files by transparently making calls to Gfarm in the background. This work has been validated through a prototype that couples the Gfarm file system with the JuxMem data-sharing service. The results have been published at Euro-Par 2008.
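The write-through idea behind this hybrid architecture can be sketched as follows. In this minimal model, `FileBackend` stands in for Gfarm and `MemoryService` for JuxMem's memory-oriented API; all names and signatures are illustrative assumptions:

```python
import os
import tempfile

class FileBackend:
    """Stand-in for a grid file system: persists each data item as a file."""
    def __init__(self, root):
        self.root = root

    def store(self, data_id, payload):
        with open(os.path.join(self.root, data_id), "wb") as f:
            f.write(payload)

    def load(self, data_id):
        with open(os.path.join(self.root, data_id), "rb") as f:
            return f.read()

class MemoryService:
    """Memory-oriented put/get API; every update is transparently written
    through to the file backend, so data survives the loss of the
    in-memory copy. In the real system the backend call would be made
    in the background, not synchronously as here."""
    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def put(self, data_id, payload):
        self.cache[data_id] = payload
        self.backend.store(data_id, payload)

    def get(self, data_id):
        if data_id not in self.cache:
            # Fault: fetch the data back from persistent storage.
            self.cache[data_id] = self.backend.load(data_id)
        return self.cache[data_id]
```

The application only ever sees the memory-oriented `put`/`get` interface; persistence to files happens behind it, which is the division of labor the JuxMem/Gfarm coupling aims for.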
During André Lage's Master's internship, we explored an alternative interaction between JuxMem and Gfarm: the goal was to use JuxMem to build a distributed, fault-tolerant metadata server that would replace Gfarm's centralized metadata server. The goal of sharing metadata through JuxMem was only partially reached, for a subset of the metadata.
Toward transparent management of interactions between applications and resources
As a result of the past collaboration between the JuxMem group and the JXTA and Gfarm teams, the need to launch complex distributed applications on large-scale testbeds led us to work on the automation of the deployment process. This was done in close collaboration with the designers of the Adage deployment tool (Christian Pérez, Landry Breuil). Software plugins for Adage were designed to handle the deployment, in a static manner, of JXTA-, JuxMem- and Gfarm-based applications. These plugins have been used in most of the experiments involving such applications.
Despite these efforts, the deployment of such applications remains quite painful for regular users, mostly because they are still in charge of the interactions with reservation and deployment tools. Therefore, the CoRDAGe model has been proposed to significantly facilitate the deployment of applications by introducing transparent, on-the-fly resource reservations in response to possibly variable needs. This model has been presented at the STHEC workshop. A prototype has been implemented and experiments have been conducted on the Grid'5000 platform. These results are part of Loïc Cudennec's thesis, to be defended on January 15, 2009.
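The core of this idea, a manager rather than the user interacting with the reservation tool and extending the reservation as needs grow, can be sketched as follows. All class and method names are illustrative assumptions, not the CoRDAGe implementation:

```python
import itertools

class BatchScheduler:
    """Stand-in for a grid batch reservation tool (illustration only)."""
    def __init__(self):
        self._ids = itertools.count()

    def reserve(self, count):
        # Return freshly reserved node names.
        return ["node%d" % next(self._ids) for _ in range(count)]

class OnTheFlyManager:
    """The manager, not the user, talks to the reservation tool:
    when the application's needs exceed the current reservation,
    it transparently reserves additional nodes on the fly."""
    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.held = []

    def ensure(self, needed):
        if needed > len(self.held):
            self.held += self.scheduler.reserve(needed - len(self.held))
        return self.held[:needed]
```

The application simply declares how many nodes it currently needs; whether that triggers a new reservation is invisible to it, which is the transparency the model aims at.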
Toward efficient versioning for large objects under heavy access concurrency
Considering the problem of efficiently managing massive data in a large-scale distributed environment, we focus on data strings (blobs) whose sizes reach the order of TB, shared and accessed at a fine grain by concurrent clients, as is the case with many data-processing abstractions such as MapReduce. On each individual access, a segment of the blob, in the order of MB, is read or modified. Our goal is to provide clients with an efficient fine-grain access and versioning interface to the blob that copes with heavy access concurrency, without the need to lock the blob itself. Our approach is illustrated through the BlobSeer prototype. The overall design enables several features:
support for the management of massive, distributed data blobs (in the order of TB);
support for a large number of blobs;
efficient, atomic, fine-grain access to each blob (e.g., in the order of MB);
implicit versioning: updates to each blob add rather than replace data and generate a new virtual global view of the blob;
powerful concurrency management: high-performance concurrent read/read, read/write and write/write access;
low storage-space overhead despite versioning.
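The versioning features above can be illustrated with a toy chunk-based model: each write publishes a new immutable version, old versions stay readable, and unmodified chunks are shared between versions, which keeps the storage overhead low. The chunk size, data layout and method names are assumptions for illustration, not BlobSeer's actual design:

```python
CHUNK = 4  # toy chunk size; real blob segments are in the order of MB

class VersionedBlob:
    def __init__(self):
        self.chunks = {}    # chunk_id -> immutable bytes (append-only store)
        self.versions = []  # each version: list of chunk_ids describing the blob
        self._next = 0

    def _put_chunk(self, data):
        cid = self._next
        self._next += 1
        self.chunks[cid] = data
        return cid

    def write(self, offset, data):
        """Publish a new version: updates add data rather than replace it.
        Chunks untouched by this write are shared with the previous
        version, so versioning adds little storage overhead."""
        assert offset % CHUNK == 0 and len(data) % CHUNK == 0
        layout = list(self.versions[-1]) if self.versions else []
        first = offset // CHUNK
        for i in range(len(data) // CHUNK):
            cid = self._put_chunk(data[i * CHUNK:(i + 1) * CHUNK])
            idx = first + i
            if idx == len(layout):
                layout.append(cid)   # append beyond the current end
            else:
                layout[idx] = cid    # overwrite one chunk reference only
        self.versions.append(layout)
        return len(self.versions) - 1  # number of the new virtual global view

    def read(self, version, offset, size):
        # Any past version remains readable: fine-grain read, no blob lock.
        layout = self.versions[version]
        blob = b"".join(self.chunks[c] for c in layout)
        return blob[offset:offset + size]
```

Because old chunks are never mutated, readers of an existing version never conflict with concurrent writers, which is the property that makes lock-free read/write concurrency possible in this style of design.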
BlobSeer, our prototype currently under development, serves to conduct experiments on the Grid'5000 platform. Preliminary results demonstrate good scalability and performance. These results have been published in  and .