Section: Overall Objectives
The Paris Project-Team aims at contributing to the programming of parallel and distributed systems for large-scale numerical simulation applications. Its goal is to design operating systems and middleware that ease the use of such computing infrastructures for the targeted applications. These applications speed up the design of complex manufactured products, such as cars or aircraft, thanks to numerical simulation techniques.
As computer performance rapidly increases, it is possible to foresee in the near future comprehensive simulations of these designs that encompass multi-disciplinary aspects (structural mechanics, computational fluid dynamics, electromagnetism, noise analysis, etc.). Numerical simulations of these different aspects cannot be carried out by a single computer, for lack of computing and memory resources. Instead, several clusters of inexpensive PCs, and probably federations of clusters (a.k.a. grids), will have to be used simultaneously to keep simulation times within reasonable bounds. Moreover, the simulations will have to be performed by different research teams, each contributing its own simulation code. These teams may all belong to a single company, or to different companies possessing the appropriate skills and computing resources, thus adding geographical constraints. By their very nature, such applications will require a computing infrastructure that is both parallel and distributed.
The Paris Project-Team is engaged in research along five topics: Operating System and Runtime for Clusters and Grids, Middleware for Computational Grids, Large-scale Data Management for Grids, Advanced Models for the Grid, and Experimental Grid Infrastructures.
Topic P2P System Foundations, which was described in the previous activity report, has been spun off into a new project-team, called ASAP, headed by Anne-Marie Kermarrec, a former member of the Paris Project-Team.
The research activities of the Paris Project-Team encompass both basic research, seeking conceptual advances, and applied research, to validate the proposed concepts against real applications. The project-team is also heavily involved in managing a national grid computing infrastructure (Grid 5000) enabling large-scale experiments.
Parallel processing to go faster
Given the significant increase in the performance of microprocessors, computer architectures and networks, clusters of standard personal computers now provide a level of performance that makes numerical simulation a handy tool. This tool should be used not only by researchers, but also by a large number of engineers designing complex physical systems. Simulations of mechanical structures, fluid dynamics or wave propagation can nowadays be carried out in a couple of hours. This is made possible by exploiting multi-level parallelism, simultaneously at a fine grain within a microprocessor, at a medium grain within a single multi-processor PC, and at a coarse grain across a cluster of such PCs. This unprecedented level of performance makes numerical simulation available to a larger number of users, such as SMEs. It also generates new needs and demands for more accurate numerical simulation. Traditional parallel processing alone cannot meet this demand.
Distributed processing to go larger
These new needs and demands are motivated by the constraints imposed by a worldwide economy: making things faster, better and cheaper.
Large-scale numerical simulation.
Large-scale numerical simulation will without a doubt become one of the key technologies for meeting such constraints. In traditional numerical simulation, only one simulation code is executed. In contrast, it is now necessary to couple several such codes together within a single simulation.
A large-scale numerical simulation application is typically composed of several codes, not to simulate a single physical phenomenon, but to perform multi-physics simulation. Simulation times can be expected to be on the order of weeks, and sometimes months, depending on the number of physical phenomena involved in the simulation and on the available computing resources.
Parallel processing only extends the number of computing resources available locally: by itself, it cannot significantly reduce simulation times, since the simulation codes will not be located at a single geographical site. This is particularly true in a global economy, where complex products (such as cars, aircraft, etc.) are not designed by a single company, but by several of them, through the use of subcontractors. Each of these companies brings its own expertise and tools, such as numerical simulation codes, and even its own private computing resources. Moreover, companies are reluctant to give access to their tools, as they may at the same time compete on other projects. It is thus clear that distributed processing cannot be avoided when managing large-scale numerical applications.
More generally, the development of large-scale distributed systems and applications now relies on resource sharing and aggregation. Distributed resources, whether related to computing, storage or bandwidth, are aggregated and made available to the whole system. Not only does this aggregation greatly improve performance as the system size increases, but many applications would simply not be possible without such a model (peer-to-peer file sharing, ad-hoc networks, application-level multicast, publish-subscribe applications, etc.).
Scientific challenges of the Paris Project-Team
The design of large-scale simulation applications raises technical and scientific challenges, both in applied mathematics and computer science. The Paris Project-Team mainly focuses its effort on Computer Science. It investigates new approaches to build software mechanisms that hide the complexity of programming computing infrastructures that are both parallel and distributed. Our contribution to the field can thus be summarized as follows:
combining parallel and distributed processing whilst preserving performance and transparency.
This contribution is developed along five directions.
- Operating system and runtime for clusters and grids.
The challenge is to design and build an operating system for clusters that hides from programmers and users the fact that resources (processors, memories, disks) are distributed. A PC cluster running such an operating system looks like a traditional multi-processor providing a Single System Image (SSI).
- Middleware for computational grids.
The challenge is to design a middleware implementing a component-based approach for grids, in which large-scale numerical applications are built by combining a set of components encapsulating simulation codes. Such a middleware must seamlessly mix parallel and distributed processing.
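To illustrate the component-based approach described above, the following minimal Python sketch wraps each simulation code in a component exposing ports, with an assembly step wiring a "uses" port of one component to a provider component. All names here (Component, connect, FluidSolver, StructureSolver) are illustrative assumptions, not the API of an actual grid middleware:

```python
# Hypothetical sketch: simulation codes as components with ports.
# Not a real middleware API; names are invented for illustration.

class Component:
    def __init__(self, name):
        self.name = name
        self.uses = {}          # "uses" port name -> connected provider

    def provide(self, port):
        """Return the service exposed on one of this component's ports."""
        return getattr(self, port)

def connect(user, port, provider):
    """Wire a 'uses' port of one component to a provider component."""
    user.uses[port] = provider

class FluidSolver(Component):
    def pressure_field(self):
        return [1.0, 2.0, 3.0]          # stand-in for a CFD result

class StructureSolver(Component):
    def step(self):
        # Pull boundary data from whichever component was wired in;
        # the solver never names a machine or a location.
        fluid = self.uses["fluid"]
        loads = fluid.provide("pressure_field")()
        return [0.1 * p for p in loads]  # stand-in deformation

fluid = FluidSolver("cfd")
struct = StructureSolver("mech")
connect(struct, "fluid", fluid)
print(struct.step())
```

In a grid setting, each solver could run on a different cluster; the point of the component model is that the assembly (the `connect` calls) is the only place where the coupling is expressed.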
- Large-scale data management for grids.
One of the key challenges in actually programming grid computing infrastructures is data management. It has to be carried out at an unprecedented scale, and to cope with the inherent dynamicity and heterogeneity of the underlying grids.
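One way to picture transparent, large-scale data management is a service through which clients read and write data via a location-independent identifier, while the service decides placement and replication behind the scenes. The sketch below is purely illustrative (the `alloc`/`get`/`put` API is an assumption, not the interface of any real grid data-sharing middleware):

```python
# Hypothetical sketch of location-transparent grid data sharing.
import uuid

class DataSharingService:
    def __init__(self, nodes):
        self.nodes = {n: {} for n in nodes}   # node name -> local store

    def alloc(self, value, replicas=2):
        """Store a value on several nodes; return a global identifier."""
        data_id = str(uuid.uuid4())
        for node in list(self.nodes)[:replicas]:
            self.nodes[node][data_id] = value
        return data_id

    def get(self, data_id):
        """Fetch from any node holding a replica; callers never name
        a location."""
        for store in self.nodes.values():
            if data_id in store:
                return store[data_id]
        raise KeyError(data_id)

    def put(self, data_id, value):
        """Update all replicas (a real system would need a consistency
        protocol to do this safely under concurrency and failures)."""
        for store in self.nodes.values():
            if data_id in store:
                store[data_id] = value

svc = DataSharingService(["cluster-a", "cluster-b", "cluster-c"])
h = svc.alloc([1, 2, 3])
svc.put(h, [4, 5, 6])
print(svc.get(h))  # prints [4, 5, 6]
```

The dynamicity and heterogeneity mentioned above are precisely what this toy ignores: a real service must also handle nodes joining, leaving, and failing while replicas stay consistent.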
- Advanced models for the Grid.
This topic aims at studying unconventional approaches to grid programming based on the chemical metaphor. The challenge is to exploit this metaphor to make the use, including the programming, of grids more intuitive and simpler.
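In chemical programming models such as Gamma, a program is a multiset of data to which reaction rules are applied, in no particular order, until no rule can fire; the implicit parallelism of independent reactions is what makes the metaphor attractive for grids. The following Python sketch of such a fixed-point evaluator is a toy illustration, not the team's actual formalism:

```python
# Toy Gamma-style evaluator: repeatedly pick a pair of elements
# satisfying the reaction condition, replace it by the reaction's
# products, and stop when no pair can react (a stable state).
import random

def gamma(multiset, reaction, condition):
    ms = list(multiset)
    done = False
    while not done:
        done = True
        random.shuffle(ms)   # reactions happen in no particular order
        for i in range(len(ms)):
            for j in range(len(ms)):
                if i != j and condition(ms[i], ms[j]):
                    a, b = ms[i], ms[j]
                    # consume the two reactants, add the product(s)
                    ms = [x for k, x in enumerate(ms) if k not in (i, j)]
                    ms.extend(reaction(a, b))
                    done = False
                    break
            if not done:
                break
    return ms

# Maximum: any pair x <= y reacts into just y; only the max survives.
print(gamma([4, 1, 7, 3, 7], lambda x, y: [y], lambda x, y: x <= y))
# prints [7]
# Sum: any pair reacts into its sum; one element remains.
print(gamma([1, 2, 3, 4], lambda x, y: [x + y], lambda x, y: True))
# prints [10]
```

The sequential loop above hides the point that matters on a grid: since any non-overlapping pairs may react concurrently, the model exposes parallelism without the programmer ever scheduling it.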
- Experimental Grid Infrastructure.
The challenge here is to design and build an instrument (in the sense of a large scientific instrument, like a telescope) for computer scientists involved in grid research. Such an instrument has to be highly reconfigurable and scalable to several thousands of resources.