Section: Overall Objectives
General objectives
Paris is a joint project of Inria, CNRS, University Rennes 1, and Insa within Irisa (UMR 6074).
In July 2009, the KerData research team, headed by Luc Bougé, was created as a spin-off of the Paris Project-Team. The research activities carried out by Luc Bougé, Gabriel Antoniu, and their PhD students from January to December 2009 are described in the KerData annual activity report.
The Paris Project-Team aims at contributing to the programming of parallel and distributed infrastructures for large-scale numerical simulation applications. Its goal is to design operating systems and middleware that ease the use of such computing infrastructures for the targeted applications. Such applications speed up the design of complex manufactured products, such as cars or aircraft, thanks to numerical simulation techniques.
As computer performance rapidly increases, it is possible to foresee in the near future comprehensive simulations of these designs that encompass multi-disciplinary aspects (structural mechanics, computational fluid dynamics, electromagnetism, noise analysis, etc.). Numerical simulations of these different aspects cannot be carried out by a single computer, for lack of computing and memory resources. Instead, several clusters of inexpensive PCs, and probably federations of clusters (a.k.a. Grids), have to be used simultaneously to keep simulation times within reasonable bounds. Moreover, simulations have to be performed by different research teams, each of them contributing its own simulation code. These teams may all belong to a single company, or to different companies possessing the appropriate skills and computing resources, thus adding geographical constraints. By their very nature, such applications will require the use of a computing infrastructure that is both parallel and distributed.
The Paris Project-Team is engaged in research along five topics: Operating System and Runtime for Clusters and Grids, Middleware Systems for Computational Grids, Large-Scale Data Management for Grids, Advanced Programming Models for the Grid, and Experimental Grid Infrastructures.
The research activities of the Paris Project-Team encompass both basic research, seeking conceptual advances, and applied research, to validate the proposed concepts against real applications. The project-team is also heavily involved in managing a national grid computing infrastructure (Grid'5000) enabling large-scale experiments.
Parallel processing to go faster
Given the significant increase in the performance of microprocessors, computer architectures, and networks, clusters of standard personal computers now provide the level of performance needed to make numerical simulation a handy tool. This tool should not be used by researchers only, but also by a large number of engineers designing complex physical systems. Simulations of mechanical structures, fluid dynamics, or wave propagation can nowadays be carried out in a couple of hours. This is made possible by exploiting multi-level parallelism, simultaneously at a fine grain within a microprocessor, at a medium grain within a single multi-processor PC, and/or at a coarse grain within a cluster of such PCs. This unprecedented level of performance definitely makes numerical simulation available to a larger number of users, such as SMEs. It also generates new needs and demands for more accurate numerical simulation. Parallel processing alone cannot meet this demand.
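As a purely illustrative sketch of this multi-level parallelism, the toy Python code below combines coarse-grain parallelism across processes (standing in for the nodes of a cluster, where a real code would use message passing such as MPI and would exchange halo data between blocks) with fine-grain, vectorized array operations within each process. The domain decomposition and the relax kernel are hypothetical, not taken from any Paris code.

# Illustrative sketch only: toy multi-level parallelism on one machine.
import numpy as np
from multiprocessing import Pool

def relax(block):
    """One Jacobi-style smoothing sweep over a 1-D block (fine grain, vectorized)."""
    out = block.copy()
    out[1:-1] = 0.5 * (block[:-2] + block[2:])
    return out

def simulate(domain, n_blocks=4):
    """Split the domain into blocks and relax them in parallel (coarse grain).
    Inter-block boundary exchange is deliberately ignored in this sketch."""
    blocks = np.array_split(domain, n_blocks)
    with Pool(processes=n_blocks) as pool:   # hypothetical: one process per "node"
        return np.concatenate(pool.map(relax, blocks))

if __name__ == "__main__":
    field = np.random.rand(1_000_000)
    print(simulate(field).shape)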
Distributed processing to go larger
These new needs and demands, mixing high performance and collaborative multidisciplinary work, are motivated by the constraints imposed by a worldwide economy: making things faster, better, and cheaper.
Large-scale numerical simulation.
Large-scale numerical simulation will, without a doubt, become one of the key technologies needed to meet such constraints. In traditional numerical simulation, only one simulation code is executed. In contrast, it is now required to couple several such codes together within a single simulation.
A large-scale numerical simulation application is typically composed of several codes, not to simulate a single physical phenomenon, but to perform multi-physics simulation. Simulation times can then be on the order of weeks, or even months, depending on the number of physical phenomena involved and on the available computing resources.
Parallel processing only increases the number of computing resources available locally: by itself, it cannot significantly reduce simulation times, since the simulation codes are not all located at a single geographical site. This is particularly true in a global economy, where complex products (such as cars or aircraft) are not designed by a single company, but by several of them, through the use of subcontractors. Each of these companies brings its own expertise and tools, such as numerical simulation codes, and even its own private computing resources. Moreover, these companies are reluctant to give others access to their tools, as they may at the same time compete on other projects. It is thus clear that distributed processing cannot be avoided when managing large-scale numerical applications.
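To make the code-coupling idea above more concrete, here is a deliberately naive Python sketch of a fixed-point coupling loop between two simulation codes. Both "solvers", their coefficients, and the exchanged interface temperature are placeholder assumptions; in a real setting, each code would run on its own, possibly remote, resources and exchange interface data through a middleware layer rather than through plain function calls.

# Illustrative sketch only: toy coupling of two placeholder simulation codes.
def fluid_step(interface_temp):
    """Placeholder 'fluid' code: returns a heat flux at the interface."""
    return 0.8 * (300.0 - interface_temp)

def structure_step(heat_flux, temp):
    """Placeholder 'structural/thermal' code: updates the interface temperature."""
    return temp + 0.1 * heat_flux

def coupled_simulation(n_steps=50):
    temp = 280.0                           # assumed initial interface temperature
    for _ in range(n_steps):
        flux = fluid_step(temp)            # code A computes from code B's last result
        temp = structure_step(flux, temp)  # code B consumes code A's result
    return temp

if __name__ == "__main__":
    print("interface temperature after coupling:", round(coupled_simulation(), 2))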
Resource aggregation.
More generally, the development of large-scale distributed systems and applications now relies on resource sharing and aggregation. Distributed resources, whether related to computing, storage or bandwidth, are aggregated and made available to the whole system. Not only does this aggregation greatly improve performance as the system size increases, but many applications would simply not be possible without such a model (peer-to-peer file sharing, ad hoc networks, application-level multicast, publish-subscribe applications, etc.).
Scientific challenges of the Paris Project-Team
The design of large-scale simulation applications raises technical and scientific challenges, both in applied mathematics and computer science. The Paris Project-Team mainly focuses its effort on Computer Science. It investigates new approaches to build software mechanisms that hide the complexity of programming computing infrastructures that are both parallel and distributed. Our contribution to the field can thus be summarized as follows:
combining parallel and distributed processing whilst preserving performance and transparency.
This contribution is developed along three directions.
- Operating system and runtime for clusters and grids.
The challenge is to design and build an operating system for clusters and grids hiding from programmers and users the fact that resources (processors, memory, disks) are distributed.
- Advanced programming models for the Grid.
This topic aims at studying unconventional approaches to grid programming based on the chemical metaphor. The challenge is to exploit this metaphor to make the programming and use of grids simpler and more intuitive (a small illustrative sketch is given after this list).
- Experimental Grid Infrastructures.
The challenge here is to be able to design and build an instrument (in the sense of a large scientific instrument, like a telescope) for computer scientists involved in grid research. Such an instrument has to be highly reconfigurable and scalable to several thousands of resources.
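As a minimal illustration of the chemical metaphor mentioned above (in the style of the Gamma model), the following Python sketch represents a program as a multiset of molecules together with a reaction rule; reactions fire nondeterministically until the solution is inert. The react function and the max_rule example are assumptions made for illustration only, not the project-team's actual formalism.

# Illustrative sketch only: chemical-style computation as multiset rewriting.
import random

def max_rule(x, y):
    """Two molecules react and are replaced by the larger one."""
    return [max(x, y)]

def react(solution, rule):
    """Pick two molecules at random and replace them by the reaction products,
    until only one molecule remains (the solution is then inert for this rule)."""
    solution = list(solution)
    while len(solution) > 1:
        i, j = random.sample(range(len(solution)), 2)
        a = solution.pop(max(i, j))   # remove the higher index first
        b = solution.pop(min(i, j))
        solution.extend(rule(a, b))
    return solution

if __name__ == "__main__":
    print(react([3, 17, 5, 42, 8], max_rule))   # prints [42], the maximum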