
## Section: New Results

### Algorithmic aspects of topological and geometric data analysis

#### Sampling and Meshing Submanifolds

Participants : Jean-Daniel Boissonnat, Siargey Kachanovich.

In collaboration with Mathijs Wintraecken (IST Austria).

This work [41], [11] presents a rather simple tracing algorithm to sample and mesh an $m$-dimensional submanifold of ${ℝ}^{d}$ for arbitrary $m$ and $d$. We extend the work of Dobkin et al. to submanifolds of arbitrary dimension and codimension. The algorithm is practical and has been thoroughly investigated from both theoretical and experimental perspectives. The paper provides a full description and analysis of the data structure and of the tracing algorithm. The main contributions are:

1. We unify and complement the knowledge about Coxeter and Freudenthal-Kuhn triangulations.
2. We introduce an elegant and compact data structure to store Coxeter or Freudenthal-Kuhn triangulations and describe output-sensitive algorithms to compute the faces and cofaces of any simplex in the triangulation.
3. We present a manifold tracing algorithm based on the above data structure. We provide a detailed complexity analysis along with experimental results showing that the algorithm can handle cases that are far beyond the reach of state-of-the-art methods.
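As an illustration of why Freudenthal-Kuhn triangulations lend themselves to compact representations, the sketch below locates the simplex containing a query point in the Freudenthal-Kuhn triangulation of the unit cubic grid. It uses the classical rule (base vertex given by the integer parts, then coordinates incremented in order of decreasing fractional part). The function name `locate_freudenthal_simplex` is hypothetical, and this is not claimed to be the data structure or the algorithm of [41], [11].

```python
import math

def locate_freudenthal_simplex(x):
    """Return the d+1 vertices of a Freudenthal-Kuhn simplex containing x.

    Assumption: the triangulation is the standard one of the integer grid Z^d,
    where each unit cube is split into d! simplices, one per ordering of the
    coordinates.
    """
    base = [math.floor(c) for c in x]            # lowest corner of the cube
    frac = [c - b for c, b in zip(x, base)]      # position inside the cube
    # Coordinates with the largest fractional part are incremented first.
    order = sorted(range(len(x)), key=lambda i: -frac[i])
    vertices = [tuple(base)]
    current = list(base)
    for i in order:
        current[i] += 1                          # step along axis i
        vertices.append(tuple(current))
    return vertices

if __name__ == "__main__":
    print(locate_freudenthal_simplex((0.7, 0.2)))
```

The simplex is fully described by one grid vertex and one ordering of the coordinates, which hints at why such triangulations can be stored implicitly rather than simplex by simplex.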

#### Topological correctness of PL-approximations of isomanifolds

Participant : Jean-Daniel Boissonnat.

In collaboration with Mathijs Wintraecken (IST Austria).

Isomanifolds are the generalization of isosurfaces to arbitrary dimension and codimension, i.e. manifolds defined as the zero set of some multivariate vector-valued function $f:{ℝ}^{d}\to {ℝ}^{d-n}$. A natural (and efficient) way to approximate an isomanifold is to consider its Piecewise-Linear (PL) approximation based on a triangulation $𝒯$ of the ambient space ${ℝ}^{d}$. In this paper [43], we give conditions under which the PL-approximation of an isomanifold is topologically equivalent to the isomanifold. The conditions are easy to satisfy in the sense that they can always be met by taking a sufficiently fine triangulation $𝒯$. This contrasts with previous results on the triangulation of manifolds where, in arbitrary dimensions, delicate perturbations are needed to guarantee topological correctness, which leads to strong limitations in practice. We further give a bound on the Fréchet distance between the original isomanifold and its PL-approximation. Finally, we show analogous results for the PL-approximation of an isomanifold with boundary.
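To make the notion of PL-approximation concrete, here is a minimal sketch in the simplest setting, a scalar function on ${ℝ}^{2}$ and a single triangle of $𝒯$. The function is replaced by its linear interpolant on the triangle, and the zero set of the interpolant is recovered from its crossings with the edges. The helper name `pl_zero_crossings` is hypothetical, and the sketch assumes the generic case where no vertex lies exactly on the zero set; it does not check the conditions established in [43].

```python
def pl_zero_crossings(triangle, f):
    """Points where the linear interpolant of f on a triangle vanishes on its edges.

    triangle: three points of R^2 given as tuples; f: callable taking a point.
    Assumption: f is non-zero at every vertex (generic position).
    """
    values = [f(p) for p in triangle]
    crossings = []
    for i in range(3):
        j = (i + 1) % 3
        a, b = triangle[i], triangle[j]
        fa, fb = values[i], values[j]
        if fa * fb < 0:                      # sign change along edge (a, b)
            t = fa / (fa - fb)               # root of the interpolant on the edge
            crossings.append(tuple(ca + t * (cb - ca) for ca, cb in zip(a, b)))
    return crossings

if __name__ == "__main__":
    line = lambda p: p[0] + p[1] - 1.0
    print(pl_zero_crossings([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)], line))
```

Within one triangle the PL-approximation is the segment joining the two returned points; gluing these segments over all triangles of $𝒯$ gives the PL curve.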

#### Dimensionality Reduction for $k$-Distance Applied to Persistent Homology

Participants : Jean-Daniel Boissonnat, Kunal Dutta.

In collaboration with Shreya Arya (Duke University).

Given a set $P$ of $n$ points and a constant $k$, we are interested in computing the persistent homology of the Čech filtration of $P$ for the $k$-distance, and investigate the effectiveness of dimensionality reduction for this problem, answering an open question of Sheehy [Proc. SoCG, 2014] [38]. We first show, using the Johnson-Lindenstrauss lemma, that the persistent homology can be preserved up to a $\left(1\pm \epsilon \right)$ factor while reducing dimensionality to $O\left(k\log n/{\epsilon }^{2}\right)$. Our main result shows that the target dimension can be improved to $O\left(\log n/{\epsilon }^{2}\right)$ under a reasonable and naturally occurring condition. The proof involves a multi-dimensional variant of the Hanson-Wright inequality for subgaussian quadratic forms and works when the random matrices used for the Johnson-Lindenstrauss mapping are subgaussian. This includes the Gaussian matrices of Indyk-Motwani, the sparse random matrices of Achlioptas and the Ailon-Chazelle fast Johnson-Lindenstrauss transform. To provide evidence that our condition encompasses quite general situations, we show, using Gershgorin's theorem, that it is satisfied when the points are independently distributed $\left(i\right)$ in ${ℝ}^{D}$ under a subgaussian distribution, or $\left(ii\right)$ on a spherical shell in ${ℝ}^{D}$ with a minimum angular separation. Our results also show that the JL-mapping preserves, up to a $\left(1\pm \epsilon \right)$ factor, the Rips and Delaunay filtrations for the $k$-distance, as well as the Čech filtration for the approximate $k$-distance of Buchet et al.
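The kind of mapping involved can be sketched with a dense Gaussian random projection, one of the subgaussian constructions mentioned above. The code below is a small pure-Python illustration with hypothetical names (`gaussian_jl_matrix`, `project`, `sq_dist`) and arbitrarily chosen dimensions. It only checks empirically that one squared distance is roughly preserved, and says nothing about the target dimensions proved in the paper.

```python
import math
import random

def gaussian_jl_matrix(target_dim, source_dim, rng):
    """Random matrix with i.i.d. N(0, 1/target_dim) entries."""
    scale = 1.0 / math.sqrt(target_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
            for _ in range(target_dim)]

def project(matrix, point):
    """Apply the linear map given by matrix to a point."""
    return [sum(a * x for a, x in zip(row, point)) for row in matrix]

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

if __name__ == "__main__":
    rng = random.Random(0)                     # fixed seed for reproducibility
    A = gaussian_jl_matrix(300, 500, rng)      # illustrative dimensions only
    p = [float(i % 7) for i in range(500)]
    q = [float(i % 5) for i in range(500)]
    ratio = sq_dist(project(A, p), project(A, q)) / sq_dist(p, q)
    print("squared-distance distortion:", ratio)
```

Since the map is linear, the distortion of a squared distance depends only on the direction of the difference vector, so the printed ratio concentrates around 1 as the target dimension grows.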

#### Edge Collapse and Persistence of Flag Complexes

Participants : Jean-Daniel Boissonnat, Siddharth Pritam.

In this article [42], we extend the notions of dominated vertex and strong collapse of a simplicial complex, as introduced by J. Barmak and E. Minian, and build on the initial success of [30]. We say that a simplex (of any dimension) is dominated if its link is a simplicial cone. Domination of edges appears to be very powerful, and we study it in more detail in the case of flag complexes. We show that edge collapse (removal of dominated edges) in a flag complex can be performed using only the 1-skeleton of the complex. Furthermore, the residual complex is a flag complex as well. Next we show that, similar to the case of strong collapses, we can use edge collapses to reduce a flag filtration $ℱ$ to a smaller flag filtration ${ℱ}^{c}$ with the same persistence. Here again, we only use the 1-skeletons of the complexes. The resulting method to compute ${ℱ}^{c}$ is simple and extremely efficient and, when used as a preprocessing step for persistence computation, leads to gains of several orders of magnitude with respect to state-of-the-art methods (including our previous approach using strong collapse). The method is exact, irrespective of dimension, and improves the performance of persistence computation even in low dimensions. This is demonstrated by numerous experiments on publicly available data.
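The domination test on the 1-skeleton can be sketched as follows. In a flag complex, the link of an edge $\{u,v\}$ is the flag complex on the common neighbours of $u$ and $v$, so it is a cone with apex $w$ exactly when $w$ is a common neighbour adjacent to all the other common neighbours. The code below is a direct reading of that definition with hypothetical names (`dominating_vertices`, `is_dominated_edge`) and a graph stored as a dictionary of neighbour sets. It is not the optimized implementation of [42].

```python
def dominating_vertices(adj, u, v):
    """Vertices w that dominate the edge {u, v} of the flag complex of a graph.

    adj: dict mapping each vertex to the set of its neighbours (1-skeleton).
    """
    common = (adj[u] & adj[v]) - {u, v}        # vertices of the link of {u, v}
    # w is a cone apex if every other vertex of the link is adjacent to w.
    return sorted(w for w in common if (common - {w}) <= adj[w])

def is_dominated_edge(adj, u, v):
    """True if the link of edge {u, v} is a simplicial cone."""
    return bool(dominating_vertices(adj, u, v))

if __name__ == "__main__":
    # 4-cycle 0-1-2-3 with the diagonal {0, 2}.
    adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
    print(is_dominated_edge(adj, 0, 1))   # link is the single vertex 2
    print(is_dominated_edge(adj, 0, 2))   # link is two non-adjacent vertices
```

Only neighbour sets are consulted, which matches the claim that higher-dimensional simplices never need to be built to decide whether an edge can be collapsed.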

#### DTM-based Filtrations

Participants : Frédéric Chazal, Marc Glisse, Raphael Tinarrage.

In collaboration with Hirokazu Anai, Yuichi Ike, Hiroya Inakoshi, and Yuhei Umeda (Fujitsu Labs).

Despite strong stability properties, the persistent homology of filtrations classically used in Topological Data Analysis, such as the Čech or Vietoris-Rips filtrations, is very sensitive to the presence of outliers in the data from which it is computed. In [15], we introduce and study a new family of filtrations, the DTM-filtrations, built on top of point clouds in Euclidean space, which are more robust to noise and outliers. The approach adopted in this work relies on the notion of distance-to-measure functions, and extends some previous work on the approximation of such functions.
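For readers unfamiliar with the distance-to-measure (DTM), the sketch below computes its usual empirical form on a point cloud: the root mean of the squared distances from a query point to its $k$ nearest data points. The function name `empirical_dtm` and the toy data are hypothetical. The example only illustrates why an isolated outlier receives a large DTM value, and does not build the DTM-filtrations of [15].

```python
import math

def empirical_dtm(points, x, k):
    """Empirical distance-to-measure of x with respect to a point cloud.

    Assumption: uniform weights on the points and mass parameter m = k / n,
    so that DTM(x)^2 is the mean squared distance to the k nearest points.
    """
    sq_dists = sorted(sum((a - b) ** 2 for a, b in zip(p, x)) for p in points)
    return math.sqrt(sum(sq_dists[:k]) / k)

if __name__ == "__main__":
    cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]  # last point is an outlier
    print(empirical_dtm(cloud, (0.0, 0.0), 2))
    print(empirical_dtm(cloud, (10.0, 10.0), 2))
```

With $k=1$ the function reduces to the distance to the point cloud, which vanishes at the outlier as well; taking $k\ge 2$ is what makes the outlier stand out.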

#### Recovering the homology of immersed manifolds

Participant : Raphael Tinarrage.

Given a sample of an abstract manifold immersed in some Euclidean space, we describe in [57] a way to recover the singular homology of the original manifold. It consists in estimating its tangent bundle (seen as a subset of another Euclidean space) from a measure-theoretic point of view, and in applying measure-based filtrations for persistent homology. The construction we propose is consistent and stable, and does not require knowledge of the dimension of the manifold.

#### Regular triangulations as lexicographic optimal chains

Participant : David Cohen-Steiner.

In collaboration with André Lieutier and Julien Vuillamy (Dassault Systèmes).

We introduce [46] a total order on $n$-simplices in $n$-dimensional Euclidean space for which the support of the lexicographic-minimal chain with the convex hull boundary as boundary constraint is precisely the $n$-dimensional Delaunay triangulation or, in a more general setting, the regular triangulation of a set of weighted points. This new characterization of regular and Delaunay triangulations is motivated by its possible generalization to submanifold triangulations as well as the recent development of polynomial-time triangulation algorithms taking advantage of this order.

#### Discrete Morse Theory for Computing Zigzag Persistence

Participant : Clément Maria.

In collaboration with Hannah Schreiber (Graz University of Technology, Austria).

#### Computing Persistent Homology with Various Coefficient Fields in a Single Pass

Participants : Jean-Daniel Boissonnat, Clément Maria.

This article [18] introduces an algorithm to compute the persistent homology of a filtered complex with various coefficient fields in a single matrix reduction. The algorithm is output-sensitive in the total number of distinct persistent homological features in the diagrams for the different coefficient fields. This computation allows us to infer the prime divisors of the torsion coefficients of the integral homology groups of the topological space at any scale, hence furnishing a more informative description of topology than persistence in a single coefficient field. We provide theoretical complexity analysis as well as detailed experimental results. The code is part of the Gudhi software library.
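The reason different coefficient fields reveal torsion can be seen on a toy computation: an integer boundary matrix may lose rank modulo a prime $p$ exactly when $p$ divides one of its torsion coefficients. The sketch below computes the rank of an integer matrix over ${ℤ}/p{ℤ}$ by Gaussian elimination. The name `rank_mod_p` is hypothetical, and this is a plain per-field computation given for illustration, not the single-pass algorithm of [18] nor the Gudhi API.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the field Z/pZ (p prime).

    Uses pow(x, -1, p) for modular inverses, which requires Python 3.8+.
    """
    rows = [[x % p for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

if __name__ == "__main__":
    boundary = [[1, 0], [0, 2]]        # toy matrix with a diagonal entry 2
    print(rank_mod_p(boundary, 2), rank_mod_p(boundary, 3))
```

On the toy matrix the rank drops modulo 2 but not modulo 3, which signals that 2 divides a torsion coefficient. Comparing diagrams across fields exploits the same phenomenon at every scale of the filtration.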

#### Exact computation of the matching distance on 2-parameter persistence modules

Participant : Steve Oudot.

In collaboration with Michael Kerber (T.U. Graz) and Michael Lesnick (SUNY).

The matching distance is a pseudometric on multi-parameter persistence modules, defined in terms of the weighted bottleneck distance on the restriction of the modules to affine lines. It is known that this distance is stable in a reasonable sense, and can be efficiently approximated, which makes it a promising tool for practical applications. In [31] we show that in the 2-parameter setting, the matching distance can be computed exactly in polynomial time. Our approach subdivides the space of affine lines into regions, via a line arrangement. In each region, the matching distance restricts to a simple analytic function, whose maximum is easily computed. As a byproduct, our analysis establishes that the matching distance is a rational number, if the bigrades of the input modules are rational.

#### Decomposition of exact pfd persistence bimodules

Participant : Steve Oudot.

In collaboration with Jérémy Cochoy (Symphonia).

In [24] we identify a certain class of persistence modules indexed over ${ℝ}^{2}$ that are decomposable into direct sums of indecomposable summands called blocks. The conditions on the modules are that they are both pointwise finite-dimensional (pfd) and exact. Our proof follows the same scheme as the one for pfd persistence modules indexed over $ℝ$, yet it departs from it at key stages due to the product order not being a total order on ${ℝ}^{2}$, which leaves some important gaps open. These gaps are filled in using more direct arguments. Our work is motivated primarily by the study of interlevel-sets persistence, although the proposed results reach beyond that setting.

#### Level-sets persistence and sheaf theory

Participants : Nicolas Berkouk, Steve Oudot.

In collaboration with Grégory Ginot (Paris 13).

In [39] we provide an explicit connection between level-sets persistence and derived sheaf theory over the real line. In particular, we construct a functor from 2-parameter persistence modules to sheaves over $ℝ$, as well as a functor in the other direction. We also observe that the 2-parameter persistence modules arising from the level sets of Morse functions carry extra structure that we call a Mayer-Vietoris system. We prove classification, barcode decomposition, and stability theorems for these Mayer-Vietoris systems, and we show that the aforementioned functors establish a pseudo-isometric equivalence of categories between derived constructible sheaves with the convolution or (derived) bottleneck distance and the interleaving distance of strictly pointwise finite-dimensional Mayer-Vietoris systems. Ultimately, our results provide a functorial equivalence between level-sets persistence and derived pushforward for continuous real-valued functions.

#### Intrinsic Interleaving Distance for Merge Trees

Participant : Steve Oudot.

In collaboration with Ellen Gasparovic (Union College), Elizabeth Munch (Michigan State), Katharine Turner (Australian National University), Bei Wang (Utah), and Yusu Wang (Ohio State).

Merge trees are a type of graph-based topological summary that tracks the evolution of connected components in the sublevel sets of scalar functions. They enjoy widespread applications in data analysis and scientific visualization. In [49] we consider the problem of comparing two merge trees via the notion of interleaving distance in the metric space setting. We investigate various theoretical properties of such a metric. In particular, we show that the interleaving distance is intrinsic on the space of labeled merge trees and provide an algorithm to construct metric 1-centers for collections of labeled merge trees. We further prove that the intrinsic property of the interleaving distance also holds for the space of unlabeled merge trees. Our results are a first step toward performing statistics on graph-based topological summaries.