## Section: New Results

### Application Domains

#### Material physics

##### EigenSolver

The adaptive vibrational configuration interaction algorithm has been introduced as a new eigenvalue method for large-dimension problems. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. It efficiently reduces the dimension of the set of basis functions used, which enables us to solve vibrational eigenvalue problems up to dimension 15 (7 atoms). Beyond this molecule size, two major issues appear: the size of the approximation domain increases exponentially with the number of atoms, and so does the density of eigenvalues in the target area.
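In schematic form, such an adaptive construction can be illustrated as follows. This is a toy Python sketch on a small dense Hamiltonian, not the actual implementation: the function and parameter names are ours, and the residual-driven enrichment is a generic stand-in for the theoretical selection criterion mentioned above.

```python
import numpy as np

def adaptive_eigensolver(H, n_seed=5, n_add=5, tol=1e-8, max_it=50):
    """Schematic adaptive CI loop: solve the eigenproblem restricted to a
    small active basis, then enrich the basis with the components of the
    residual H v - lambda v that still violate the tolerance."""
    n = H.shape[0]
    active = np.arange(n_seed)                    # initial (seed) basis
    for _ in range(max_it):
        w, V = np.linalg.eigh(H[np.ix_(active, active)])
        v = np.zeros(n)
        v[active] = V[:, 0]                       # ground state in the active basis
        r = H @ v - w[0] * v                      # residual in the full space
        outside = np.setdiff1d(np.arange(n), active)
        if outside.size == 0 or np.abs(r[outside]).max() < tol:
            return w[0], v, active                # converged
        worst = outside[np.argsort(-np.abs(r[outside]))[:n_add]]
        active = np.concatenate([active, worst])  # enlarge the nested basis
    return w[0], v, active
```

When the spectrum is well separated, the loop typically stops with an active basis much smaller than the full discretization, which is the point of the approach.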

This year we have worked on two main areas. First of all, not all of the computed eigenvalues can be observed by spectroscopy, and those that cannot are of no interest to chemists: only eigenvalues with a non-zero intensity are relevant. We have therefore set up a selection of the interesting eigenvalues using the intensity operator. This requires computing the scalar product between the eigenvectors associated with the smallest eigenvalues and the dipole moment operator applied to an eigenvector, in order to evaluate its intensity. In addition, to get closer to the experimental values, we have introduced the Coriolis operator into the Hamiltonian. A paper is being written on these last two points, showing that for a 10-atom molecule we can reach the area of interest (i.e., more than 2,400 eigenvalues). Moreover, we continue to extend our shared-memory parallelization to distributed memory using the message-passing paradigm, in order to speed up the eigensolver.
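The intensity-based selection can be sketched as follows. This is a hypothetical illustration: `select_by_intensity`, the squared-scalar-product proxy, and the threshold are our simplifications, not the code of the actual eigensolver.

```python
import numpy as np

def select_by_intensity(eigvecs, dipole, threshold):
    """Keep the eigenpairs whose transition intensity from the ground
    state exceeds a threshold.

    eigvecs : (n, k) array, columns sorted by increasing eigenvalue
    dipole  : (n, n) matrix of the dipole moment operator
    """
    d_ground = dipole @ eigvecs[:, 0]        # dipole applied to the ground state
    # intensity proxy: squared scalar product with each computed state
    intensities = (eigvecs.T @ d_ground) ** 2
    return np.flatnonzero(intensities > threshold), intensities
```

In this sketch the cost of the filter is one matrix-vector product plus one scalar product per computed eigenpair, so it is negligible compared with the eigensolve itself.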

#### Co-design for scalable numerical algorithms in scientific applications

##### Numerical and parallel scalable hybrid solvers in large scale calculations

We have been working with the NACHOS team on the treatment of the system of three-dimensional frequency-domain (or time-harmonic) Maxwell equations using a high-order hybridizable discontinuous Galerkin (HDG) approximation method combined with domain decomposition (DD) based hybrid iterative-direct parallel solution strategies. The proposed HDG method preserves the advantages of classical DG methods previously introduced for the time-domain Maxwell equations, in particular in terms of accuracy and flexibility with regard to the discretization of complex geometrical features, while keeping the computational efficiency at the level of the reference edge-element-based finite element formulation widely adopted for the considered PDE system. We study in detail the computational performance of the resulting DD solvers, in particular in terms of scalability metrics, by considering both a model test problem and more realistic large-scale simulations performed on high performance computing systems consisting of networked multicore nodes. More information on these results can be found in [2].

In the context of a parallel plasma physics simulation code, we perform a qualitative performance study between two natural candidates for the parallel solution of 3D Poisson problems, namely multigrid and domain decomposition. We selected one representative of each of these numerical techniques, implemented in state-of-the-art parallel packages, and show that, depending on the regime in terms of number of unknowns per computing core, the best alternative in terms of time to solution varies. These results show the interest of having both types of numerical solvers integrated in a simulation code, which can be used in very different configurations in terms of problem sizes and parallel computing platforms. More information on these results will shortly be available in an Inria scientific report.
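The multigrid candidate can also be illustrated in miniature: below is a geometric V-cycle for the 1D Poisson equation with damped-Jacobi smoothing. It is a didactic sketch, not one of the state-of-the-art packages used in the study, and the names are ours.

```python
import numpy as np

def apply_A(x, h):
    """Matrix-free 1D Laplacian (homogeneous Dirichlet BCs)."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y / h**2

def vcycle(x, b, h, nu=3):
    """One geometric-multigrid V-cycle for -u'' = f on (0,1) with
    n = 2^k - 1 interior unknowns and damped-Jacobi smoothing."""
    n = b.size
    if n <= 3:                                   # coarsest level: direct solve
        A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
        return np.linalg.solve(A, b)
    omega = 2.0 / 3.0
    for _ in range(nu):                          # pre-smoothing
        x = x + omega * (h**2 / 2.0) * (b - apply_A(x, h))
    r = b - apply_A(x, h)
    rc = 0.25 * (r[0:-2:2] + 2 * r[1::2] + r[2::2])   # full-weighting restriction
    xc = vcycle(np.zeros_like(rc), rc, 2 * h, nu)     # coarse-grid correction
    e = np.zeros(n)                              # linear-interpolation prolongation
    e[1::2] = xc
    e[2:-1:2] = 0.5 * (xc[:-1] + xc[1:])
    e[0] = 0.5 * xc[0]
    e[-1] = 0.5 * xc[-1]
    x = x + e
    for _ in range(nu):                          # post-smoothing
        x = x + omega * (h**2 / 2.0) * (b - apply_A(x, h))
    return x
```

Each V-cycle reduces the residual by a grid-independent factor, which is the property that makes multigrid attractive when the number of unknowns per core is large.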

##### Efficient Parallel Solution of the 3D Stationary Boltzmann Transport Equation for Diffusive Problems

In the context of a collaboration with EDF-Lab through the PhD of Salli Moustafa, we present an efficient parallel method for the deterministic solution of the 3D stationary Boltzmann transport equation applied to diffusive problems such as nuclear core criticality computations. Based on standard MultiGroup-Sn-DD discretization schemes, our approach combines a highly efficient nested parallelization strategy with the PDSA parallel acceleration technique, applied for the first time to 3D transport problems. These two key ingredients enable us to solve extremely large neutronic problems involving up to $10^{12}$ degrees of freedom in less than an hour using 64 super-computer nodes.

These contributions have been published in Journal of Computational Physics (JCP) [7].
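The baseline iteration that acceleration techniques such as PDSA are designed to speed up can be sketched in one spatial dimension. The toy Python code below performs unaccelerated source iteration with discrete-ordinates (Sn) sweeps and diamond differencing; all names and parameters are ours, and it shares only the general scheme with the production code.

```python
import numpy as np

def source_iteration(nx=50, n_ang=8, sigma_t=1.0, sigma_s=0.5,
                     q=1.0, length=10.0, tol=1e-8, max_it=500):
    """Unaccelerated source iteration for the 1D slab-geometry transport
    equation with isotropic scattering: discrete ordinates (Sn) in angle,
    diamond-difference sweeps in space, vacuum boundary conditions."""
    dx = length / nx
    mu, w = np.polynomial.legendre.leggauss(n_ang)   # Sn angles and weights
    phi = np.zeros(nx)                               # scalar flux
    for it in range(1, max_it + 1):
        s = 0.5 * (sigma_s * phi + q)                # isotropic in-cell source
        phi_new = np.zeros(nx)
        for m in range(n_ang):
            a = abs(mu[m]) / dx
            cells = range(nx) if mu[m] > 0 else range(nx - 1, -1, -1)
            psi_in = 0.0                             # vacuum inflow
            for i in cells:                          # transport sweep
                psi_out = (s[i] + (a - 0.5 * sigma_t) * psi_in) / (a + 0.5 * sigma_t)
                phi_new[i] += w[m] * 0.5 * (psi_in + psi_out)
                psi_in = psi_out
        if np.max(np.abs(phi_new - phi)) < tol:
            return phi_new, it
        phi = phi_new
    return phi, max_it
```

The number of iterations of this scheme grows with the scattering ratio $\sigma_s/\sigma_t$ and stalls in diffusive regimes where the ratio approaches one, which is precisely why acceleration techniques such as the PDSA method above are needed.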

##### Bridging the Gap Between $\mathscr{H}$-Matrices and Sparse Direct Methods for the Solution of Large Linear Systems

For the sake of numerical robustness in aeroacoustics simulations, the solution techniques based on the factorization of the matrix associated with the linear system are the methods of choice when affordable. In that respect, hierarchical methods based on low-rank compression have allowed a drastic reduction of the computational requirements for the solution of dense linear systems over the last two decades. For sparse linear systems, their application remains a challenge which has been studied by both the community of hierarchical matrices and the community of sparse matrices. On the one hand, the first step taken by the community of hierarchical matrices most often takes advantage of the sparsity of the problem through the use of nested dissection. While this approach benefits from the hierarchical structure, it is not, however, as efficient as sparse solvers regarding the exploitation of zeros and the structural separation of zeros from non-zeros. On the other hand, sparse factorization is organized so as to lead to a sequence of smaller dense operations, enticing sparse solvers to use this property and exploit compression techniques from hierarchical methods in order to reduce the computational cost of these elementary operations. Nonetheless, the globally hierarchical structure may be lost if the compression of hierarchical methods is used only locally on dense submatrices. In [1], we have reviewed the main techniques that have been employed by both of these communities, trying to highlight their common properties and their respective limits, with a special emphasis on studies that have aimed to bridge the gap between them. With these observations in mind, we have proposed a class of hierarchical algorithms based on the symbolic analysis of the structure of the factors of a sparse matrix. These algorithms rely on symbolic information to cluster and construct a hierarchical structure coherent with the non-zero pattern of the matrix.
Moreover, the resulting hierarchical matrix relies on low-rank compression for the reduction of the memory consumption of large submatrices as well as the time to solution of the solver. We have also compared multiple ordering techniques based on geometrical or topological properties. Finally, we have opened the discussion to a coupling between the Finite Element Method and the Boundary Element Method in a unified computational framework.
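The low-rank compression underlying these hierarchical representations can be demonstrated on an admissible (well-separated) block. The sketch below, with a kernel and names of our own choosing, compresses such a block by truncated SVD:

```python
import numpy as np

def compress_block(M, tol=1e-8):
    """Truncated-SVD low-rank compression of a (sub)matrix, the basic
    building block of hierarchical-matrix storage: M is approximated by
    the rank-k product U @ V."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))      # relative truncation of singular values
    return U[:, :k] * s[:k], Vt[:k]      # (m, k) and (k, n) factors
```

For two well-separated point clusters, the interaction block of a smooth kernel has rapidly decaying singular values, so the stored factors are far smaller than the original block; this is the memory and time reduction exploited by the solver.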

##### Design of a coupled MUMPS - $\mathscr{H}$-Matrix solver for FEM-BEM applications

In this approach, the FEM matrix is eliminated by computing a Schur complement using MUMPS. Given the size of the BEM matrix, this cannot be done in one operation, so it is performed block by block, the blocks being accumulated into the $\mathscr{H}$-matrix, which is then factorized to complete the process. The overall process yields an interesting boost in performance when compared to the previously existing approach that coupled MUMPS with a classical dense solver. However, a full comparison with all the other existing methods must still be performed (full $\mathscr{H}$-matrix solver with [22] or without nested dissection, iterative approaches, etc.).
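The block-by-block elimination can be sketched with dense stand-ins: below, `np.linalg.solve` plays the role of the MUMPS sparse direct solver and a plain dense matrix plays the role of the $\mathscr{H}$-matrix; the names and block size are ours.

```python
import numpy as np

def schur_block_by_block(A_ff, A_fb, A_bf, A_bb, block=64):
    """Eliminate the 'FEM' unknowns by accumulating the Schur complement
    S = A_bb - A_bf A_ff^{-1} A_fb block-column by block-column into the
    'BEM' matrix (dense numpy stand-ins for MUMPS and the H-matrix)."""
    S = A_bb.copy()
    for j0 in range(0, A_fb.shape[1], block):
        j1 = min(j0 + block, A_fb.shape[1])
        X = np.linalg.solve(A_ff, A_fb[:, j0:j1])   # direct solve, one block
        S[:, j0:j1] -= A_bf @ X                     # accumulate into S
    return S
```

Once `S` is assembled, factorizing it and back-substituting yields the same boundary solution as solving the full coupled system, which is what makes the block-wise construction legitimate.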

##### Metabarcoding

The Distance Geometry Problem (DGP) and Nonlinear Mapping (NLM) are two well-established problems: DGP is about finding a Euclidean realization of an incomplete set of distances in a Euclidean space, whereas NLM is a weighted Least Square Scaling (LSS) method. We show how all these methods (LSS, NLM, DGP) can be assembled in a common framework, each being identified as an instance of an optimization problem with a particular choice of weight matrix. In [6], we studied the continuity of the solutions (which are point clouds) when the weight matrix varies, and the compactness of the set of solutions (after centering). We finally studied a numerical example, showing that solving the optimization problem is far from simple and that the numerical solution for a given procedure may be trapped in a local minimum.
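A minimal concrete version of this common framework, with our own naming and not the formulation of [6], is the weighted stress below: different weight matrices recover the different methods, and a naive gradient descent on it illustrates how easily such optimizers get trapped in local minima.

```python
import numpy as np

def weighted_stress(X, D, W):
    """Objective shared by LSS, NLM and (relaxed) DGP: the weighted sum
    over pairs of (||x_i - x_j|| - D_ij)^2.  W = 1 gives plain LSS,
    W = 1/D gives NLM, and a 0/1 mask on the known distances gives a
    DGP-style instance."""
    dX = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sum(np.triu(W * (dX - D) ** 2, k=1))

def stress_descent(D, W, dim=2, steps=500, lr=0.002, seed=0):
    """Naive gradient descent on the weighted stress, starting from a
    random point cloud; deliberately simple, hence prone to local minima."""
    n = D.shape[0]
    X = np.random.default_rng(seed).standard_normal((n, dim))
    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]
        dX = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(dX, 1.0)                 # avoid 0/0 on the diagonal
        coef = W * (dX - D) / dX
        np.fill_diagonal(coef, 0.0)
        X -= lr * 2.0 * np.sum(coef[:, :, None] * diff, axis=1)
    return X
```

Because the stress is invariant under translations and rotations, any solution is only defined up to a rigid motion, which is why the compactness result mentioned above requires centering.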

We are involved in the ADT Gordon (partners: TADAAM (coordinator), STORM, HiePACS, PLEIADE). The objective of this ADT is to scale our solver stack on a dimensioning PLEIADE metabarcoding application (multidimensional scaling method). Our goal is to be able to handle a problem leading to a distance matrix of around 100 million individuals. Our contribution concerns the scalability of the multidimensional scaling method, and more particularly the random projection methods used to speed up the SVD solver. Experiments on the PlaFRIM and MCIA CURTA platforms have shown that the solver stack can efficiently solve a large problem of up to 300,000 individuals in less than 10 minutes on 25 nodes. This has highlighted that, for these problem sizes, the management of I/O with the disks becomes critical and dominates the computation time.
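The random-projection idea can be sketched at toy scale: classical multidimensional scaling driven by a Halko/Martinsson/Tropp-style randomized SVD. The function names are ours, and the real solver stack of course operates on distributed data at a vastly larger scale.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    """Randomized SVD: sample the range of A with a random Gaussian
    matrix, then run a small exact SVD in that subspace."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ omega)        # orthonormal basis of the sampled range
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

def mds_embed(D, dim=2):
    """Classical multidimensional scaling: double-center the squared
    distance matrix and keep the leading factors (computed here with the
    randomized SVD above instead of a full eigendecomposition)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix of the centered cloud
    U, s, _ = randomized_svd(B, dim)
    return U * np.sqrt(s)
```

The key point for scalability is that the random projection replaces a full decomposition of the (huge) centered matrix by a decomposition of a tall-and-skinny sample of it, at the cost of a controlled approximation.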