Team MESCAL


Section: New Results

Scheduling

Participants: Jean-Michel Fourneau, Bruno Gaujal, Arnaud Legrand, Jean-François Méhaut.

Mean-Field Analysis

In [28], we study the limit behavior of Markov decision processes (MDPs) made of independent particles evolving in a common environment, when the number of particles goes to infinity. In the finite-horizon case, and in the infinite-horizon case with a discounted cost, we show that as the number of particles grows, the optimal cost of the system converges almost surely to the optimal cost of a deterministic system (the "optimal mean field"). Convergence also holds for optimal policies. We further provide insights on the speed of convergence by proving several central limit theorems for the cost and the state of the Markov decision process, with explicit formulas for the variance of the limit Gaussian laws. We then apply this framework to a brokering problem in grid computing, for which the optimal policy of the limit deterministic system is computed explicitly. Simulations with growing numbers of processors compare the performance of the optimal policy of the limit system, used in the finite case, with classical policies (such as Join the Shortest Queue) by measuring its asymptotic gain.
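The convergence of a finite particle system to a deterministic limit can be illustrated with a toy model. The sketch below is not the model of [28]: it has no control policy, and the two-state dynamics, the rates `a` and `b`, and the function names are assumptions chosen only for illustration. Each of `n` particles is in state 0 or 1, and the transition probabilities depend on the current fraction of particles in state 1, which is the only interaction between particles.

```python
import random

def mean_field_step(m, a=0.3, b=0.5):
    """One step of the (assumed) deterministic limit.

    m is the fraction of particles in state 1. Particles in state 0 move to 1
    with probability a; particles in state 1 move to 0 with probability b*(1-m).
    """
    return m + (1.0 - m) * a - m * b * (1.0 - m)

def limit(steps, a=0.3, b=0.5):
    """Iterate the deterministic recurrence from m = 0."""
    m = 0.0
    for _ in range(steps):
        m = mean_field_step(m, a, b)
    return m

def simulate(n, steps, a=0.3, b=0.5, seed=0):
    """Simulate n interacting particles and return the final fraction in state 1."""
    rng = random.Random(seed)
    ones = 0  # all particles start in state 0
    for _ in range(steps):
        m = ones / n
        # Each particle in state 0 jumps to state 1 with probability a.
        up = sum(rng.random() < a for _ in range(n - ones))
        # Each particle in state 1 jumps to state 0 with probability b*(1-m).
        down = sum(rng.random() < b * (1.0 - m) for _ in range(ones))
        ones += up - down
    return ones / n

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, abs(simulate(n, 20) - limit(20)))
```

Running the script prints the gap between the empirical fraction and the deterministic value for increasing `n`; under these assumptions the gap is expected to shrink roughly like the inverse square root of `n`, in line with a central limit behavior.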

Work Stealing for Streaming Systems

In [16], we study the performance of parallel stream computations on a multiprocessor architecture using a work-stealing strategy. Incoming tasks are split into a number of jobs allocated to the processors, and whenever a processor becomes idle, it steals a fraction (typically half) of the jobs from a busy processor. We propose a new model for the performance analysis of such parallel stream computations. This model takes into account both the algorithmic behavior of work stealing and the hardware constraints of the architecture (synchronizations and bus contention). We show that this model can be solved using a recursive formula, and that this recursive analytical approach is more efficient than the classic global balance technique. However, the method remains computationally impractical when tasks are split into many jobs or when many processors are considered. We therefore propose bounds that solve very large models efficiently in an approximate manner. Experimental results show that these bounds are tight and robust, so that they can be used directly in optimization studies. An example is provided for the optimization of energy consumption under performance constraints. In addition, the framework is flexible, and we show how it adapts to several stealing strategies.
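The steal-half rule described above can be sketched as a small discrete-time simulation. This is only an illustration and not the analytical model of [16]: it assumes unit-size jobs, a single task initially placed on processor 0, stealing from the most loaded processor, and zero cost for steals (no synchronization or bus contention). The function name is hypothetical.

```python
def work_stealing_makespan(num_jobs, num_procs):
    """Return the number of time steps needed to execute num_jobs unit jobs.

    Assumptions (illustrative only): all jobs start on processor 0, an idle
    processor steals half of the jobs of the most loaded processor, and a
    steal takes no time.
    """
    queues = [0] * num_procs
    queues[0] = num_jobs
    steps = 0
    while sum(queues) > 0:
        # Steal phase: every idle processor tries to steal half of the
        # jobs held by the currently most loaded processor.
        for i in range(num_procs):
            if queues[i] == 0:
                victim = max(range(num_procs), key=lambda j: queues[j])
                stolen = queues[victim] // 2
                queues[victim] -= stolen
                queues[i] += stolen
        # Execution phase: every busy processor executes one job.
        queues = [q - 1 if q > 0 else 0 for q in queues]
        steps += 1
    return steps

if __name__ == "__main__":
    for p in (1, 2, 4):
        print(p, work_stealing_makespan(8, p))
```

With free steals the load balances almost immediately; adding a per-steal delay or a contention penalty to the steal phase would be the natural place to reflect the hardware constraints mentioned in the text.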

Scheduling of Computing Services on Virtual Clusters

In [13], we consider a context where the resources available on a company's intranet are used as a virtual cluster for scientific computation during idle periods (nights, weekends, holidays, etc.). In general, these idle periods are too short for the computations to run to completion. For instance, a workstation used during the night must be released in the morning so that it is available to the employee, even if the application running on it has not completed. It is therefore necessary to save the context of uncompleted applications for a possible restart. We assume that the computations running on the workstations are independent of each other. The checkpointing mechanism that ensures the continuity of applications is subject to resource constraints: the network bandwidth, the disk bandwidth, and the delay T imposed for releasing the workstations. We first show that designing a scheduling strategy that optimizes resource consumption while respecting these constraints can be formalized as a variant of the classical 0/1 knapsack problem. We then propose an algorithm whose implementation adds no significant overhead to the checkpointing mechanisms. Experiments carried out on a real cluster show that this algorithm performs better than the naive scheduling algorithm, which selects applications one after the other in order of decreasing resource consumption.
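To make the knapsack formulation concrete, the sketch below solves the classical 0/1 knapsack problem by dynamic programming and compares it with a greedy baseline. It is not the algorithm or the exact variant of [13]: the mapping is an assumption in which each application has an integer checkpoint cost (for example, seconds needed to save its context), a value (for example, the work preserved by saving it), and the budget plays the role of the release delay T. The greedy rule (decreasing cost, skipping applications that no longer fit) is likewise an assumed reading of the naive strategy.

```python
def knapsack(costs, values, budget):
    """Classical 0/1 knapsack by dynamic programming.

    costs must be non-negative integers and budget a non-negative integer.
    Returns (best total value, sorted list of chosen indices).
    """
    n = len(costs)
    # best[i][b]: best value using the first i items within budget b.
    best = [[0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c, v = costs[i - 1], values[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if c <= b and best[i - 1][b - c] + v > best[i][b]:
                best[i][b] = best[i - 1][b - c] + v
    # Backtrack to recover one optimal selection.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen

def greedy_by_decreasing_cost(costs, values, budget):
    """Assumed naive baseline: take items by decreasing cost while they fit."""
    total, chosen, remaining = 0, [], budget
    for i in sorted(range(len(costs)), key=lambda k: -costs[k]):
        if costs[i] <= remaining:
            chosen.append(i)
            remaining -= costs[i]
            total += values[i]
    return total, sorted(chosen)

if __name__ == "__main__":
    costs = [6, 4, 3, 3]    # hypothetical checkpoint costs
    values = [6, 5, 4, 4]   # hypothetical values of saving each application
    print(knapsack(costs, values, 10))                   # (13, [1, 2, 3])
    print(greedy_by_decreasing_cost(costs, values, 10))  # (11, [0, 1])
```

On this hypothetical instance the greedy baseline fills the budget with the two most expensive applications, whereas the dynamic program finds a selection of three cheaper ones with a higher total value.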

