Section: Scientific Foundations

Numerical algorithms and High Performance Computing

The focus of this topic is the design of efficient and robust parallel numerical algorithms for computational engineering. The objective is to deal with large-scale numerical simulations, and high performance computing is required to tackle such large-scale problems. Algorithms and solvers are applied to problems arising from hydrogeology and geophysics (Section 4.1).

A problem at the kernel of most scientific applications is the solution of large linear systems of equations Ax = b, where the matrix A is sparse (most of its coefficients are zero). The target is giga-systems with billions (10^9) of unknowns.

Direct linear solvers

Direct methods, based on the factorization A = LU, induce fill-in in the factors L and U. Reordering techniques can be used to reduce this fill-in, and hence the memory requirements and the number of floating-point operations.

More precisely, direct methods involve two steps: first factoring the matrix A into the product A = P1LUP2, where P1 and P2 are permutation matrices, L is lower triangular and U is upper triangular; then solving P1LUP2x = b by processing one factor at a time. The most time-consuming and complicated step is the first one, which is further broken down into the following steps:

• Choose P1 and diagonal scaling matrices D1 and D2 so that P1D1AD2 has a "large diagonal." This helps to ensure the accuracy of the final solution.

• Choose P2 so that the L and U factors of P1AP2 are as sparse as possible.

• Perform symbolic analysis, i.e. identify the locations of nonzero entries of L and U .

• Factorize P1AP2 into L and U .
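The factor-then-solve structure can be sketched in a toy dense setting. The following pure-Python code is only an illustrative analogue of the last step above: it computes an LU factorization with partial row pivoting (PA = LU, playing the role of P1) and then solves by forward and back substitution. Sparse direct solvers add the column ordering P2, scaling, and symbolic analysis on top of this; none of that is shown here.

```python
# Illustrative dense LU factorization with partial pivoting (PA = LU),
# followed by triangular solves. A toy analogue of the factorization and
# solve steps described above; not a sparse solver.

def lu_factor(A):
    """Return row permutation p and packed LU factors of a copy of A."""
    n = len(A)
    LU = [row[:] for row in A]
    p = list(range(n))
    for k in range(n):
        # partial pivoting: bring the largest entry in column k to the diagonal
        piv = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if piv != k:
            LU[k], LU[piv] = LU[piv], LU[k]
            p[k], p[piv] = p[piv], p[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]            # multiplier, stored in L's slot
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return p, LU

def lu_solve(p, LU, b):
    """Solve Ax = b from the packed factors: forward then back substitution."""
    n = len(LU)
    y = [b[p[i]] for i in range(n)]
    for i in range(n):                       # Ly = Pb (L has unit diagonal)
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):             # Ux = y
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x

A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]
p, LU = lu_factor(A)
x = lu_solve(p, LU, b)   # solves Ax = b; here x = [1, 1, 1]
```

The expensive factorization is done once; additional right-hand sides reuse the factors through the cheap triangular solves.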

The team has worked on parallel sparse direct solvers and compares existing direct and iterative solvers.

Iterative linear solvers

The two main classes of iterative solvers are Krylov methods and multigrid methods.

A Krylov subspace is, for example, K_m(A, r_0) = span{r_0, Ar_0, ..., A^{m-1}r_0}. If the matrix is symmetric positive definite, the Krylov method of choice is the Conjugate Gradient; for symmetric indefinite matrices, there are mainly three methods: SYMMLQ, MINRES and LSQR. For unsymmetric matrices, it is not possible to have both the minimization property and short recurrences. The GMRES method minimizes the residual but must be restarted to limit memory requirements. The BICGSTAB and QMR methods have short recurrences but do not guarantee a decreasing residual. All iterative methods require preconditioning to speed up convergence: the system M^{-1}Ax = M^{-1}b is solved, where M is a matrix close to A such that linear systems Mz = c are easy to solve. A family of preconditioners uses incomplete factorizations A = LU + R, where R is implicitly defined by the level of fill-in allowed in L and U. Other types of preconditioners include algebraic multigrid, approximate inverses and domain decomposition.
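A minimal sketch of a preconditioned Krylov method, in pure Python: the Conjugate Gradient for a small symmetric positive definite system, with a Jacobi (diagonal) preconditioner M = diag(A) standing in for the richer preconditioners mentioned above. Purely illustrative; the test matrix is an assumed toy example.

```python
# Preconditioned Conjugate Gradient sketch with a Jacobi preconditioner.
# M^{-1} is applied as multiplication by the inverse diagonal of A.

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, tol=1e-12, maxit=200):
    n = len(b)
    Minv = [1.0 / A[i][i] for i in range(n)]       # Jacobi: M = diag(A)
    x = [0.0] * n
    r = b[:]                                        # r = b - Ax with x = 0
    z = [Minv[i] * r[i] for i in range(n)]          # z = M^{-1} r
    p = z[:]
    rz = dot(r, z)
    for _ in range(maxit):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            break
        z = [Minv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# SPD toy system (tridiagonal, Poisson-like); exact solution is [1, 1, 1]
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = pcg(A, b)
```

Note that A is only needed through matrix-vector products, which is what makes Krylov methods attractive for large sparse systems.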

Multigrid methods can be used as solvers in their own right or as preconditioners. They can be either geometric or algebraic.
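A geometric two-grid cycle, the building block of multigrid, can be sketched for the 1D Poisson problem -u'' = f: a few weighted-Jacobi smoothing sweeps damp the oscillatory error, the residual is restricted to a coarser grid, the coarse problem is solved directly, and the correction is interpolated back. This pure-Python toy (grid sizes, smoother weight, and right-hand side are all assumed for illustration) is not the team's software.

```python
# Two-grid V-cycle for -u'' = f on (0,1) with homogeneous Dirichlet BCs.
# Smoother: weighted Jacobi; restriction: full weighting; prolongation:
# linear interpolation; coarse solve: direct tridiagonal (Thomas).

def apply_A(u, h):
    n = len(u)
    return [(2*u[i] - (u[i-1] if i else 0.0)
             - (u[i+1] if i < n - 1 else 0.0)) / h**2 for i in range(n)]

def jacobi(u, f, h, sweeps=3, omega=2/3):
    for _ in range(sweeps):
        Au = apply_A(u, h)
        u = [u[i] + omega * (f[i] - Au[i]) * h**2 / 2 for i in range(len(u))]
    return u

def poisson_direct(f, h):
    """Thomas algorithm on the coarsest level."""
    n = len(f); d = [2 / h**2] * n; rhs = f[:]
    for i in range(1, n):
        w = (-1 / h**2) / d[i - 1]
        d[i] -= w * (-1 / h**2)
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / d[-1]
    for i in reversed(range(n - 1)):
        u[i] = (rhs[i] + (1 / h**2) * u[i + 1]) / d[i]
    return u

def two_grid(u, f, h):
    u = jacobi(u, f, h)                                   # pre-smoothing
    r = [fi - ai for fi, ai in zip(f, apply_A(u, h))]
    nc = (len(u) - 1) // 2
    rc = [(r[2*i] + 2*r[2*i+1] + r[2*i+2]) / 4 for i in range(nc)]
    ec = poisson_direct(rc, 2 * h)                        # coarse correction
    e = [0.0] * len(u)
    for i in range(nc):                                   # prolongation
        e[2*i + 1] = ec[i]
    for i in range(0, len(u), 2):
        left = ec[i//2 - 1] if i >= 2 else 0.0
        right = ec[i//2] if i//2 < nc else 0.0
        e[i] = (left + right) / 2
    u = [ui + ei for ui, ei in zip(u, e)]
    return jacobi(u, f, h)                                # post-smoothing

n, h = 15, 1.0 / 16
f = [2.0] * n                       # -u'' = 2 has exact solution x(1 - x)
u = [0.0] * n
for _ in range(20):
    u = two_grid(u, f, h)
exact = [(i + 1) * h * (1 - (i + 1) * h) for i in range(n)]
```

A full multigrid method applies this cycle recursively instead of solving the coarse problem directly.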

The team studies preconditioners for Krylov methods and uses multigrid methods. The team also works on the development of parallel software for iterative solvers (PCG, GMRES, subdomain methods) and for least-squares solvers (QR factorization).

Domain decomposition methods

Domain decomposition methods are hybrid, semi-iterative methods combining iterative and direct techniques. They can be based on the alternating Schwarz method when the subdomains overlap, or on the Schur complement method when they do not. Schwarz methods can be used as preconditioners for Krylov methods, or directly with an acceleration based on Aitken extrapolation. Schur methods lead to a reduced system, which is solved by a preconditioned Krylov method.
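The mix of direct local solvers inside an outer iteration can be illustrated by an alternating Schwarz sweep for the 1D Poisson problem -u'' = 2 on (0,1), split into two overlapping subdomains: each subdomain problem is solved exactly by a direct tridiagonal solve, using the current iterate as Dirichlet data at the artificial interface. This pure-Python toy (grid size, overlap, and iteration count are assumed) is only a sketch of the method.

```python
# Alternating (multiplicative) Schwarz for -u'' = 2, u(0) = u(1) = 0.
# Exact solution: u(x) = x(1 - x). Two overlapping subdomains; each local
# problem is solved directly with the Thomas algorithm.

def tridiag_solve(n, h, f_rhs):
    """Solve (-u_{i-1} + 2u_i - u_{i+1})/h^2 = rhs_i, i = 0..n-1."""
    a = [-1.0 / h**2] * n          # sub-diagonal
    d = [2.0 / h**2] * n           # diagonal
    c = [-1.0 / h**2] * n          # super-diagonal
    rhs = f_rhs[:]
    for i in range(1, n):
        w = a[i] / d[i - 1]
        d[i] -= w * c[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / d[-1]
    for i in reversed(range(n - 1)):
        u[i] = (rhs[i] - c[i] * u[i + 1]) / d[i]
    return u

n, h = 19, 1.0 / 20                # 19 interior grid points
f = [2.0] * n
u = [0.0] * n                      # initial guess
lo, hi = 7, 12                     # subdomain 1: points 0..11, subdomain 2: 7..18
for _ in range(50):                # Schwarz sweeps
    # subdomain 1: current value u[hi] is the Dirichlet data on the right
    rhs1 = f[:hi]; rhs1[-1] += u[hi] / h**2
    u[:hi] = tridiag_solve(hi, h, rhs1)
    # subdomain 2: current value u[lo-1] is the Dirichlet data on the left
    rhs2 = f[lo:]; rhs2[0] += u[lo - 1] / h**2
    u[lo:] = tridiag_solve(n - lo, h, rhs2)

exact = [(i + 1) * h * (1 - (i + 1) * h) for i in range(n)]
```

The convergence rate depends on the overlap width, which is why the Aitken acceleration or a Krylov outer loop mentioned above is used in practice.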

The team studies these various aspects of domain decomposition methods.

Linear least-squares problems

For linear least-squares problems min_x ||Ax - b||_2, direct methods are based on the normal equations A^T A x = A^T b, using either a Cholesky factorization of A^T A or a QR factorization of A, whereas the most common Krylov iterative method is LSQR. If the discrete problem is ill-posed, regularization such as Tikhonov regularization or a Truncated Singular Value Decomposition (TSVD) is required. For large matrices, the so-called complete orthogonal factorization is also useful. The first step is a pivoted QR factorization, followed by a second factorization of the form A = U C V^T, where U and V are orthogonal matrices and C is block triangular with a trailing block E that is negligible with respect to the chosen threshold. Such a decomposition is a robust rank-revealing factorization and it provides the Moore-Penrose generalized inverse for free. Efficient QR factorization software libraries have recently become available, but they do not consider column or row permutations based on numerical considerations, since the corresponding orderings often lead to an intractable level of fill-in.
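The normal-equations approach can be sketched in a few lines of pure Python: form A^T A and A^T b, then solve the small symmetric positive definite system by a Cholesky factorization. The line-fitting data below is an assumed toy example; as noted above, QR or LSQR is preferred when A is ill-conditioned, since forming A^T A squares the condition number.

```python
# Least squares via the normal equations A^T A x = A^T b, solved by
# Cholesky factorization A^T A = L L^T plus two triangular solves.

def normal_equations_lstsq(A, b):
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    # Cholesky factorization AtA = L L^T
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = AtA[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s ** 0.5 if i == j else s / L[j][j]
    # forward substitution L y = Atb, then back substitution L^T x = y
    y = [0.0] * n
    for i in range(n):
        y[i] = (Atb[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Fit a line c0 + c1*t through the points (0,1), (1,3), (2,5)
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 3.0, 5.0]
c = normal_equations_lstsq(A, b)   # here the fit is exact: c = [1, 2]
```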

The team studies iterative Krylov methods for regularized problems, as well as rank-revealing QR factorizations.

Nonlinear problems and time integration

Nonlinear methods for solving F(x) = 0 include fixed-point methods, nonlinear stationary methods, the secant method and Newton's method. The team studies Newton-Krylov methods, where the linearized problem is solved by a Krylov method, as well as Broyden methods and Proper Orthogonal Decomposition (POD) methods.
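A minimal Newton iteration for F(x) = 0 in two variables: at each step the linearized system J(x) dx = -F(x) is solved, here directly for the 2x2 Jacobian. In a Newton-Krylov method this inner solve would instead be handed to a (possibly matrix-free) Krylov solver. The nonlinear system below is an assumed toy example.

```python
# Newton's method for a 2x2 nonlinear system with analytic Jacobian.
# The linear step J dx = -F is solved by Cramer's rule for illustration.

def F(x):
    # toy system with root (1, 1): circle of radius sqrt(2) and line x = y
    return [x[0] ** 2 + x[1] ** 2 - 2.0,
            x[0] - x[1]]

def J(x):
    # analytic Jacobian of F
    return [[2.0 * x[0], 2.0 * x[1]],
            [1.0, -1.0]]

def newton(x, tol=1e-12, maxit=50):
    for _ in range(maxit):
        f = F(x)
        if max(abs(v) for v in f) < tol:
            break
        (a, b), (c, d) = J(x)
        det = a * d - b * c
        dx0 = (-f[0] * d + b * f[1]) / det    # Cramer's rule for J dx = -F
        dx1 = (-a * f[1] + c * f[0]) / det
        x = [x[0] + dx0, x[1] + dx1]
    return x

x = newton([2.0, 0.5])   # converges quadratically to (1, 1)
```

The quadratic convergence near the root is what makes the inexact (Krylov) inner solves affordable: the inner tolerance can be loose in early iterations.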

Another subject of interest is time decomposition methods. The idea is to divide the time interval into subintervals, to apply a time-stepping scheme within each subinterval, and to apply a nonlinear correction at the ends of the subintervals. This can be applied to explosive or oscillatory problems.
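One well-known scheme of this flavour is the parareal algorithm (used here as an assumed representative, not necessarily the team's method): a cheap coarse propagator G sweeps the whole interval sequentially, while an accurate fine propagator F, which can run in parallel across subintervals, supplies corrections at the subinterval ends. A toy sketch for y' = -y, y(0) = 1:

```python
# Parareal-style time decomposition for y' = -y on [0, 1].
# G: one explicit Euler step per subinterval (coarse, sequential).
# F: many small Euler steps per subinterval (fine, parallel in principle).

def G(y, t0, t1):
    return y + (t1 - t0) * (-y)

def F(y, t0, t1, substeps=100):
    h = (t1 - t0) / substeps
    for _ in range(substeps):
        y = y + h * (-y)
    return y

N = 10                               # number of time subintervals
T = [n / N for n in range(N + 1)]
U = [1.0] * (N + 1)
for n in range(N):                   # initial coarse sweep
    U[n + 1] = G(U[n], T[n], T[n + 1])
for k in range(5):                   # parareal correction iterations
    Fk = [F(U[n], T[n], T[n + 1]) for n in range(N)]   # parallelizable
    Gk = [G(U[n], T[n], T[n + 1]) for n in range(N)]
    for n in range(N):               # sequential update with end-point correction
        U[n + 1] = G(U[n], T[n], T[n + 1]) + Fk[n] - Gk[n]

# After a few iterations U[-1] approaches the fine-solver value of y(1),
# which itself approximates exp(-1) ~ 0.3679.
```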

Eigenvalue problems

Let us consider the problem of computing some extremal eigenvalues of a large sparse symmetric matrix A. The Davidson method is a subspace method that builds a sequence of subspaces onto which the initial problem is projected. At every step, approximations of the sought eigenpairs are computed: let Vm be an orthonormal basis of the subspace at step m and let (λ, z) be an eigenpair of the matrix Hm = Vm^T A Vm; then the Ritz pair (λ, x = Vm z) is an approximation of an eigenpair of A. The specificity of the method comes from how the subspace is augmented for the next step. In contrast to the Lanczos method, which is the reference method, the subspaces are not Krylov subspaces: the new vector t = x + y which is added to the subspace is obtained by an acceleration procedure, where the correction y is computed by an exact Newton step (Jacobi-Davidson method) or an inexact Newton step (Davidson method). These methods bring a substantial improvement over the Lanczos method when computing the eigenvalues of smallest amplitude. For that reason, the team considered Davidson methods to compute the smallest singular values of a matrix B, by applying them to the matrix B^T B.
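The Ritz-value idea above appears in its simplest, one-dimensional form in the power method: the subspace is span{v} and the Ritz value is the Rayleigh quotient v^T A v / v^T v. Davidson-type methods generalize this with cleverly augmented subspaces; the pure-Python toy below (matrix and starting vector are assumed examples) only illustrates the eigenpair-approximation principle, not the Davidson method itself.

```python
# Power iteration with Rayleigh-quotient eigenvalue estimate for a
# symmetric matrix: the one-dimensional Ritz projection.

def power_method(A, maxit=500, tol=1e-12):
    n = len(A)
    v = [1.0] + [0.0] * (n - 1)
    lam = 0.0
    for _ in range(maxit):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]   # w = A v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]
        new_lam = sum(x * y for x, y in zip(v, Av))   # Rayleigh quotient
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return lam, v

A = [[2.0, 1.0], [1.0, 2.0]]    # eigenvalues 1 and 3
lam, v = power_method(A)         # lam converges to the dominant eigenvalue 3
```

Note that plain power iteration finds the eigenvalue of largest amplitude; the interest of Davidson-type corrections, as stated above, is precisely the hard case of the eigenvalues of smallest amplitude.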

Robust algorithms for characterizing spectra

In several applications, the eigenvalues of a nonsymmetric matrix are often needed to decide whether they belong to a given part of the complex plane (e.g. the half-plane of complex numbers with negative real part, or the unit disc). However, since the matrix is not known exactly (at best, to the precision of its floating-point representation), the result of the computation is not always guaranteed, especially for ill-conditioned eigenvalues. Actually, the problem is not to compute the eigenvalues precisely, but to characterize whether they lie in a given region of the complex plane. For that purpose, the notion of ε-spectrum, or equivalently the notion of pseudospectrum, was introduced simultaneously by Godunov and Trefethen. Several teams have proposed software to compute pseudospectra, including the SAGE team with the software PPAT, described in Section 5.1.
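One common characterization is that z belongs to the ε-pseudospectrum of A when the smallest singular value of (zI - A) is at most ε. For a 2x2 matrix this test can be written in closed form, since sigma_min^2 is the smallest eigenvalue of B^H B with B = zI - A. The toy membership test below is only an illustration of that definition; PPAT targets large sparse matrices with far more sophisticated techniques.

```python
# Pseudospectrum membership test for a 2x2 matrix: z is in the
# eps-pseudospectrum of A iff sigma_min(zI - A) <= eps.

def sigma_min_2x2(B):
    t = sum(abs(B[i][j]) ** 2 for i in range(2) for j in range(2))  # tr(B^H B)
    detB = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    d = abs(detB) ** 2                                              # det(B^H B)
    small = (t - (t * t - 4 * d) ** 0.5) / 2   # smallest eigenvalue of B^H B
    return max(small, 0.0) ** 0.5

def in_pseudospectrum(A, z, eps):
    B = [[z - A[0][0], -A[0][1]], [-A[1][0], z - A[1][1]]]
    return sigma_min_2x2(B) <= eps

A = [[0.0, 100.0], [0.0, 0.0]]      # highly non-normal: both eigenvalues are 0
# z = 3 is far from the spectrum, yet lies inside the 0.1-pseudospectrum
inside = in_pseudospectrum(A, 3.0, 0.1)
```

The example shows why pseudospectra matter for non-normal matrices: a point well away from every eigenvalue can still be an eigenvalue of a tiny perturbation of A.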

Parallel spatial discretization

Our applications in hydrogeology and geophysics (Section 4.1) are in the framework of Partial Differential Algebraic Equations (PDAE). We usually discretize in time with a classical one-step or multistep scheme, and in space with a Finite Element Method or a similar method. To obtain a fully parallel implementation, it is necessary to parallelize the matrix generation and computation. A common approach is to divide the computational domain into subdomains. Once the matrix is computed, it is used in linear solvers; the challenge is to reduce communication between these two phases. Recently, we have also investigated particle methods and have developed a parallel particle tracker.
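The reason subdomain-wise matrix generation parallelizes well is that finite-element matrices are assembled element by element, so disjoint chunks of elements can be assembled independently and their contributions summed. A pure-Python toy for the 1D P1 stiffness matrix (sequential here for clarity; in a real code each chunk would go to a different process):

```python
# Subdomain-wise assembly of the 1D P1 finite-element stiffness matrix.
# Element e connects nodes e and e+1 and contributes (1/h) * [[1,-1],[-1,1]].

def assemble_chunk(elements, n, h):
    """Assemble the contributions of a subset of elements into an n x n matrix."""
    K = [[0.0] * n for _ in range(n)]
    for e in elements:
        for i, j, val in [(e, e, 1.0), (e, e + 1, -1.0),
                          (e + 1, e, -1.0), (e + 1, e + 1, 1.0)]:
            K[i][j] += val / h
    return K

n_nodes, n_elems = 5, 4
h = 1.0 / n_elems
chunks = [range(0, 2), range(2, 4)]   # two "subdomains" of elements
parts = [assemble_chunk(c, n_nodes, h) for c in chunks]
K = [[sum(p[i][j] for p in parts) for j in range(n_nodes)]
     for i in range(n_nodes)]
# K is the usual tridiagonal 1D stiffness matrix (before boundary conditions)
```

Only the nodes shared between chunks (here node 2) receive contributions from both subdomains, which is exactly where interprocess communication is needed in a distributed assembly.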

Parallel and distributed stochastic simulations

In our applications, we use stochastic modelling in order to take geophysical variability into account. From a numerical point of view, this amounts to running multiparametric simulations. The objective is to exploit the power of heterogeneous parallel and distributed architectures.
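Multiparametric runs map naturally onto a pool of workers: each parameter set is an independent simulation, so the only coordination needed is the final collection of results. A minimal sketch with the Python standard library; `run_simulation` is a placeholder function standing in for a real solver run, and the parameter sets are assumed examples.

```python
# Distributing independent multiparametric simulations over a worker pool.

from concurrent.futures import ThreadPoolExecutor

def run_simulation(params):
    """Placeholder for one stochastic simulation with a given parameter set."""
    permeability, seed = params
    # a real run would launch a solver here; we return a mock statistic
    return permeability * 2.0 + seed * 0.001

parameter_sets = [(k * 0.1, seed) for k in range(1, 5) for seed in range(3)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_simulation, parameter_sets))
# results are returned in the order of parameter_sets
```

On a cluster or grid, the same pattern is used with processes or distributed tasks instead of threads; because the runs are independent, the approach scales to heterogeneous architectures with little effort.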
