Section: New Results
Providing access to HPC servers on the Grid
In many scientific areas, such as high-energy physics, bioinformatics, and astronomy, we encounter applications involving numerous simpler components that process large data sets, execute scientific simulations, and share both data and computing resources. Such data-intensive applications consist of multiple components (tasks) that may communicate and interact with each other. The tasks are often precedence-related: data files generated by one task are needed to start another. This problem is known as workflow scheduling. Surprisingly, the problem of scheduling multiple workflows online does not appear to have been fully addressed. We studied several heuristics based on list scheduling to solve this problem, and implemented a simulator to classify the behavior of these heuristics depending on the shape and size of the task graphs. Some of these heuristics are implemented within Diet and tested with the bioinformatics applications involved in the Décrypthon program.
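As an illustration, the core of a list-scheduling heuristic for precedence-related tasks can be sketched as follows. This is a minimal sketch, not the Diet implementation: the task names, durations, priority rule (longest task first), and machine model are all illustrative assumptions.

```python
# Minimal list-scheduling sketch: process tasks of a DAG in a topological
# order and place each task on the machine giving it the earliest finish
# time. All names and numbers are illustrative, not taken from Diet.

def list_schedule(tasks, deps, durations, n_machines):
    """deps maps a task to the set of tasks it depends on;
    returns (task -> machine, task -> finish time)."""
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    machine_free = [0.0] * n_machines   # time at which each machine is free
    finish = {}                         # task -> finish time
    placement = {}                      # task -> machine index
    while remaining:
        # tasks whose predecessors have all completed
        ready = [t for t, d in remaining.items() if not d]
        # one possible list priority: longest task first
        ready.sort(key=lambda t: -durations[t])
        for t in ready:
            del remaining[t]
            # data produced by predecessors must be available
            data_ready = max((finish[p] for p in deps.get(t, ())), default=0.0)
            # pick the machine minimizing the finish time of t
            m = min(range(n_machines),
                    key=lambda i: max(machine_free[i], data_ready) + durations[t])
            start = max(machine_free[m], data_ready)
            finish[t] = start + durations[t]
            machine_free[m] = finish[t]
            placement[t] = m
            for d in remaining.values():
                d.discard(t)
    return placement, finish
```

Different heuristics of this family differ mainly in the priority rule used to sort the ready list and in the machine-selection criterion.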
We also work on scheduling workflows when the services they involve are not necessarily present on all computing resources. In that case, services must be scheduled with more than short-term performance in mind: for example, a powerful resource may be kept idle so as to be available later for a job that can run on no other resource. Numerous heuristics have been designed, and we are currently evaluating them before implementing them in the Diet Grid middleware.
Moreover, during the collaboration with the associate team of the University of Hawai`i at Mānoa, we worked on the integration of our grid middleware Diet with a bioinformatics project: ByOPortal. This project aims to provide non-computer scientists with the means to run computational tasks on DNA or protein databases. We developed a generic Diet client/server to abstract the communication level between the web portal and the computation resources. This client/server can handle any kind of job described as a series of command lines to be executed either sequentially or in parallel with data dependencies, thus describing a dataflow. The innovative part of this work is that it can dynamically manage the shape of the dataflow depending on the number of outputs each step produces (thus the parallelism level is not statically set when the workflow is submitted). This work is currently being tested on a cluster at the University of Hawai`i.
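The dynamic shaping of the dataflow can be sketched as follows. This is an illustrative local sketch only: in the real client/server each item would be handled by a Diet service rather than a Python function, and the names (`run_dataflow`, `split`, `upper`) are ours.

```python
# Sketch of a dynamically shaped dataflow: the fan-out of each stage is
# decided at run time from the number of outputs the previous step
# produced, so the parallelism level is not fixed at submission time.

def run_dataflow(stages, initial_inputs):
    """Run a pipeline of stages; each stage maps one input to a list of
    outputs, and the next stage is instantiated once per output."""
    inputs = list(initial_inputs)
    for step in stages:
        outputs = []
        for item in inputs:             # each of these could run in parallel
            outputs.extend(step(item))  # fan-out depends on what step returns
        inputs = outputs
    return inputs

# toy usage: a "split" step whose output count is data-dependent
split = lambda s: s.split(",")   # e.g., one database record -> several hits
upper = lambda s: [s.upper()]    # a one-to-one step
print(run_dataflow([split, upper], ["a,b", "c"]))  # -> ['A', 'B', 'C']
```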
Service Discovery in Peer-to-Peer environments
This work combines grid computing and P2P systems, and falls within the field of their interaction for service discovery in grid computing. The DLPT (Distributed Lexicographic Placement Table) is a prefix-tree structured overlay network, developed within the GRAAL project since 2005, that provides large-scale discovery of computing services.
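The lexicographic tree underlying such an overlay can be sketched locally as a trie over service names. This sketch is only a centralized illustration of the data structure: in the actual DLPT the tree nodes are distributed over peers, which is precisely what the mapping and fault-tolerance work below addresses.

```python
# Local trie sketch of a lexicographic placement table: services are
# registered under their names, and prefix queries return every matching
# service with its providers. Names ("DGEMM", "p1", ...) are illustrative.

class Node:
    def __init__(self):
        self.children = {}
        self.providers = set()   # peers registered for this exact name

def register(root, name, peer):
    node = root
    for ch in name:
        node = node.children.setdefault(ch, Node())
    node.providers.add(peer)

def lookup(root, prefix):
    """Return all (service, providers) pairs under a prefix,
    enabling prefix/range queries such as 'DGE*'."""
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []
        node = node.children[ch]
    results, stack = [], [(prefix, node)]
    while stack:
        name, n = stack.pop()
        if n.providers:
            results.append((name, sorted(n.providers)))
        stack.extend((name + c, child) for c, child in n.children.items())
    return results
```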
This year, on the theoretical side, we concentrated on two aspects. First, we completed a solution to the problems of logical/physical association and load balancing within such a network, using mapping mechanisms and load-balancing heuristics.
Second, we have developed new approaches for the problem of fault tolerance within such architectures. In collaboration with Ajoy K. Datta (University of Nevada, Las Vegas, USA), we have designed a self-stabilizing protocol in the realistic message-passing paradigm.
On the practical side, our DLPT software prototype, developed in collaboration with the INRIA project-team RESO, has been deployed on the Grid'5000 platform. The results of the first experiments, conducted on several clusters, are promising.
Deployment for DIET: Software and Research
Concerning deployment, we needed to provide a clustering of resources. Indeed, Grid environments, like mobile ad hoc networks, are highly distributed, changeable, and error-prone. To improve communication efficiency within these platforms, i.e., to minimize communication costs, a commonly used approach is to group well-connected nodes into clusters. Many centralized and distributed algorithms cluster graphs according to a given metric, such as hop distance or weighted distance. We are more specifically interested in designing a k-clustering, that is, grouping the nodes into clusters such that within a cluster no node is farther than k from a special node called the clusterhead. Designing self-stabilizing algorithms is a good way to cope with the errors and dynamicity of these platforms. We designed the first self-stabilizing k-clustering algorithm on weighted graphs, proved its correctness and complexity, and implemented a simulator to validate its efficiency. This is joint work with Ajoy K. Datta and Lawrence L. Larmore (University of Nevada, Las Vegas, USA).
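The k-clustering definition can be illustrated by a simple greedy, centralized sketch: repeatedly pick an unclustered node as clusterhead and absorb every unclustered node within weighted distance k of it. This is only an illustration of the problem, not the self-stabilizing algorithm itself, whose distributed, fault-recovering mechanics do not fit in a few lines; the deterministic head choice is our assumption.

```python
import heapq

def k_clustering(graph, k):
    """graph: {node: {neighbor: weight}}; returns {clusterhead: members},
    where every member is within weighted distance k of its head."""
    unclustered = set(graph)
    clusters = {}
    while unclustered:
        head = min(unclustered)        # deterministic choice for the sketch
        # Dijkstra from the head, truncated at distance k
        dist = {head: 0}
        heap = [(0, head)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue               # stale heap entry
            for v, w in graph[u].items():
                nd = d + w
                if nd <= k and nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        members = {v for v in dist if v in unclustered}
        clusters[head] = members
        unclustered -= members
    return clusters
```

On a path a-b-c-d with unit weights and k = 1, this yields two clusters headed by a and c.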
Joint Scheduling and Data Management
Usually, in existing Grid computing environments, data replication and scheduling are two independent tasks. In some cases, replication managers are asked to find the best replicas in terms of access cost. But the best replica should be chosen at the same time as computation requests are scheduled. We first proposed an algorithm that simultaneously computes the mapping of data and of the computational requests on these data, using a linear program and a method to obtain a mixed solution of this program, i.e., one with both integer and rational values. However, our results only held if the submitted requests precisely followed the usage frequencies given as input to the static replication and scheduling algorithm. Due to the specificity of biological experiments, these frequencies may occasionally change. To cope with such changes, we developed a dynamic algorithm and a set of heuristics that monitor the execution platform and decide when to move data and reschedule requests. The main goal of this algorithm is to balance the computation load across servers. Using the OptorSim simulator, we compared the results of the different heuristics. These simulations show that, under our hypotheses, this set of heuristics reliably adapts data placement and request scheduling to make efficient use of all computation resources.
In this previous work, we designed a scheduling strategy based on the hypothesis that, over a large enough time interval, the proportion of jobs using a given piece of data remains constant. As observed in execution traces of bioinformatics clusters, this hypothesis seems to correspond to the way these clusters are generally used. However, this algorithm takes into account neither the initial data distribution costs nor, in its original version, the variations in the proportions of submitted jobs. We introduced algorithms that achieve good performance as soon as the process starts and handle data redistribution when needed. We target a continuous stream of jobs running linear-time algorithms, whose execution time depends on the size of the data to which they are applied. Each job is submitted to a Resource Broker, which chooses a Computing Element (CE) on which to queue the job. When a job is queued on a CE, it waits for the next worker node that can execute it, with a FIFO policy. Our algorithms try to take temporary changes in platform usage into account, and they do not need dynamic information about the nodes (CPU load, free memory, etc.): the only information used to make scheduling decisions is the frequency of each kind of submitted job. Thus, the only information needed by the scheduler is collected by the scheduler itself, avoiding the use of complex platform monitoring services. As a next step, we will concentrate on the data redistribution process, which is itself a non-trivial problem: we will study redistribution strategies to improve the performance of the algorithms that dynamically choose where to replicate data on the platform. Some large-scale experiments have already been carried out on the Grid'5000 experimental platform using the Diet middleware. This work is done in collaboration with the PCSV team of the IN2P3 institute in Clermont-Ferrand.
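The monitoring-free dispatching idea can be sketched as follows. This is a hedged sketch under our own assumptions, not the actual heuristics: the CE names, relative powers, cost model, and load estimate (work sent divided by power) are all illustrative.

```python
import collections

# Sketch of frequency-only dispatching: the scheduler's sole inputs are
# the observed frequency of each job type and its own record of the work
# it has already dispatched -- no CPU-load or memory monitoring.

class FrequencyScheduler:
    def __init__(self, ce_power, replicas):
        self.ce_power = ce_power            # CE -> relative compute power
        self.replicas = replicas            # data id -> CEs holding a copy
        self.sent = collections.Counter()   # CE -> work units dispatched
        self.freq = collections.Counter()   # job type -> observed count

    def submit(self, job_type, data_id, cost):
        self.freq[job_type] += 1            # the scheduler's only statistic
        # among CEs holding the data, pick the one with the smallest
        # normalized estimate of already-dispatched work
        ce = min(self.replicas[data_id],
                 key=lambda c: self.sent[c] / self.ce_power[c])
        self.sent[ce] += cost
        return ce                           # job is then queued FIFO on ce

# toy usage: ce2 is twice as powerful, so it receives twice the work
s = FrequencyScheduler({"ce1": 1, "ce2": 2}, {"d": ["ce1", "ce2"]})
print([s.submit("blast", "d", 1) for _ in range(4)])
```

A data redistribution step would then use `s.freq` to decide which data sets deserve additional replicas.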
Parallel Job Submission Management
We have used the Diet functionality to transparently submit parallel jobs to parallel resources in several experiments, some with Ramses (see Section 4.5) and others using the Décrypthon applications. A client/server for the Lammps software (see Section 4.2) is work in progress. Mechanisms to tune moldable jobs have also been implemented and used with the Scotch sparse linear solver.
Scheduling and Deployment for Cosmological simulations
Cosmological simulations are parameter-sweep applications: a set of parameters must be tested to find the best parameterization of the model. We are currently working on two particular codes, GalaxyMaker and MoMaF, whose purpose is to model the formation and evolution of galaxies. To be able to run such software in a grid environment, we developed a Diet client/server. It can dynamically spawn and delete services to improve the software's data management: the amount of data produced at each intermediate step makes it hard to run many simulations concurrently, so data migration and deletion must be managed correctly. Deployment and scheduling algorithms for these workflows are currently being developed, but still need to be validated.
Grid Middleware Interoperability
In the context of the Redimps project, Diet has been extended with protocol interoperability with the ITBL middleware, which manages Japanese computing resources at JAEA (the Japan Atomic Energy Agency). A demo was presented at the INRIA booth at Supercomputing'08.