Section: Scientific Foundations
Scheduling for Parallel Sparse Direct Solvers and Combinatorial Scientific Computing
Participants: Guillaume Joslin, Maurice Brémond, Johannes Langguth, Jean-Yves L'Excellent, Bora Uçar, Mohamed Sid-Lakhdar.
The solution of sparse systems of linear equations (symmetric or unsymmetric, often with an irregular structure) is at the heart of many scientific applications arising in domains such as geophysics, chemistry, electromagnetism, structural optimization, and computational fluid dynamics. The importance and diversity of these application fields are our main motivation to pursue research on sparse linear solvers. Furthermore, to solve the hard problems that result from the ever-increasing demand for accuracy in simulations, special attention must be paid to both memory usage and execution time on the most powerful parallel platforms, whose use is necessary because of the volume of data and the amount of computation involved. This is achieved through specific algorithmic choices and scheduling techniques. From a complementary point of view, it is also necessary to be aware of the functionality required by applications and users, so that robust solutions can be proposed for a large range of problems.
Because of their efficiency and robustness, direct methods (based on Gaussian elimination) are methods of choice for solving these problems. In this context, we are particularly interested in the multifrontal method [88], [89] for symmetric positive definite, general symmetric, or unsymmetric problems, with numerical pivoting to ensure numerical accuracy. Numerical pivoting induces dynamic updates of the data structures that cannot be predicted by a static or symbolic analysis.
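To make the role of numerical pivoting concrete, the following is a minimal sketch of dense Gaussian elimination with row partial pivoting, the kind of kernel applied to each frontal matrix. It is an illustrative textbook version, not the solver's actual implementation; the function name and the plain-list matrix representation are choices made here for clarity.

```python
def lu_partial_pivoting(A):
    """Factor A as P A = L U with row partial pivoting.

    A is a dense n x n matrix given as a list of lists.  Returns
    (perm, LU): perm records the row permutation, and LU stores the
    multipliers of L (unit lower triangular) below the diagonal and
    U on and above the diagonal, in place.
    """
    n = len(A)
    LU = [row[:] for row in A]          # work on a copy
    perm = list(range(n))
    for k in range(n):
        # numerical pivoting: pick the largest entry in column k;
        # the resulting row swaps are data-dependent, hence not
        # predictable by a purely symbolic analysis
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            LU[k], LU[p] = LU[p], LU[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]        # multiplier l_ik
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return perm, LU
```

For instance, factoring [[2, 1], [4, 3]] swaps the two rows because 4 dominates column 0, which no static analysis of the sparsity pattern alone could anticipate.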
The multifrontal method is based on an elimination tree [92], which results (i) from the graph structure corresponding to the nonzero pattern of the problem to be solved, and (ii) from the order in which variables are eliminated. This tree provides the dependency graph of the computations and is exploited to define tasks that may be executed in parallel. In the multifrontal method, each node of the tree corresponds to a task (itself potentially parallel) that consists in the partial factorization of a dense matrix. This approach provides good data locality and hence an efficient use of cache memories.
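The elimination tree can be computed directly from the symmetric sparsity pattern. The sketch below, assuming a lower-triangular edge list as input, follows the classical algorithm with path compression (due to Liu); it is a simplified illustration rather than the data structure a production solver would use.

```python
def elimination_tree(n, entries):
    """Compute the elimination tree of an n x n symmetric pattern.

    `entries` lists the lower-triangular nonzeros (i, j) with i > j.
    Returns parent[v] for each column v (-1 for a root): parent[v] is
    the smallest row index of the fill-in-aware structure below the
    diagonal in column v.
    """
    parent = [-1] * n
    ancestor = [-1] * n                 # path-compression shortcut
    rows = [[] for _ in range(n)]
    for i, j in entries:
        rows[i].append(j)
    for i in range(n):
        for j in rows[i]:
            # climb from j towards the current root, compressing
            # the traversed path so later climbs are short
            r = j
            while ancestor[r] != -1 and ancestor[r] != i:
                t = ancestor[r]
                ancestor[r] = i
                r = t
            if ancestor[r] == -1:
                ancestor[r] = i
                parent[r] = i           # r's subtree hangs under i
    return parent
```

Nodes with the same parent are independent subtrees: their partial factorizations can proceed in parallel, which is exactly the tree parallelism exploited by the multifrontal method.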
We are especially interested in approaches that are intrinsically dynamic and asynchronous [1], [83], as these approaches can encapsulate numerical pivoting and can be adapted to various computer architectures. In addition to their numerical robustness, the algorithms are based on a dynamic and distributed management of the computational tasks, not so far from today's peer-to-peer approaches: each process is responsible for providing work to some other processes and at the same time acts as a worker for others. These algorithms are very interesting from the point of view of parallelism, and in particular for the study of mapping and scheduling strategies, for the following reasons:

the associated task graphs are very irregular and can vary dynamically,

they are currently used inside industrial applications, and

the evolution of high-performance platforms towards more heterogeneous and less predictable ones requires that applications adapt themselves, using a mixture of dynamic and static approaches, as our approach allows.
Our research in this field is strongly linked to the software package Mumps (see Section 5.2), which is our main platform to experiment with and validate new ideas and to pursue new research directions. We are facing new challenges for the very large problems (tens to hundreds of millions of equations) that occur nowadays in various application fields. The evolution of architectures towards clusters of multicore nodes, with ever more parallelism, is another challenge we must address.
There are strong links between sparse direct methods and combinatorial scientific computing, a more general field. The aim of combinatorial scientific computing is to design combinatorial algorithms whose use reduces the amount of resources needed for the solution of a target problem arising in scientific computing. The general approach is to identify issues that affect the performance of a scientific computing application (such as memory use or parallel speedup) and to develop combinatorial models and related algorithms to alleviate them. Our target scientific computing applications are the preprocessing phases of direct (in particular Mumps), iterative, and hybrid methods for solving linear systems of equations, and the mapping of tasks (mostly the subtasks of such solvers) onto modern computing platforms. We focus on the development and use of graph and hypergraph models, and related tools such as hypergraph partitioning, for load balancing and task mapping for parallel efficiency; and on bipartite graph matching and vertex ordering for reducing the memory overhead and computational requirements of solvers. Although we approach these models and algorithms through the lens of linear system solvers, they are general enough to apply to other resource optimization problems.
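As one concrete instance of the combinatorial tools mentioned above, a maximum bipartite matching between rows and columns of a sparse matrix can be used in preprocessing: a perfect matching yields a column permutation that places nonzeros on the diagonal. The sketch below uses the classical augmenting-path method; it is a minimal illustration with an adjacency-list input assumed here, not the weighted variants used in practice.

```python
def maximum_bipartite_matching(n_rows, n_cols, adj):
    """Maximum matching in a bipartite graph by augmenting paths.

    `adj[r]` lists the columns holding a nonzero in row r.  Returns
    (size, match_col) where match_col[c] is the row matched to column
    c (-1 if unmatched).  A perfect matching gives a permutation with
    a zero-free diagonal.
    """
    match_col = [-1] * n_cols

    def augment(r, seen):
        # try to match row r, rerouting previously matched rows
        for c in adj[r]:
            if c in seen:
                continue
            seen.add(c)
            if match_col[c] == -1 or augment(match_col[c], seen):
                match_col[c] = r
                return True
        return False

    size = sum(augment(r, set()) for r in range(n_rows))
    return size, match_col
```

For example, with rows {0: {c0, c1}, 1: {c0}, 2: {c2}}, row 1 claims column 0 by rerouting row 0 to column 1, producing a perfect matching of size 3.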