## Section: New Results

### Algorithmic aspects of topological and geometric data analysis

#### DTM-based filtrations

Participants : Frédéric Chazal, Marc Glisse, Raphaël Tinarrage.

In collaboration with H. Anai, Y. Ike, H. Inakoshi and Y. Umeda of Fujitsu.

Despite strong stability properties, the persistent homology of filtrations classically used in Topological Data Analysis, such as the Čech or Vietoris-Rips filtrations, is very sensitive to the presence of outliers in the data from which it is computed. In this paper [33], we introduce and study a new family of filtrations, the DTM-filtrations, built on top of point clouds in Euclidean space, which are more robust to noise and outliers. The approach relies on the notion of distance-to-measure functions and extends previous work on the approximation of such functions.
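To make the notion of distance-to-measure concrete, here is a minimal sketch of the usual empirical DTM of a query point with respect to a finite point cloud: the root mean squared distance to its $k$ nearest neighbours. The function name `dtm`, the parameter `k` and the toy data are illustrative assumptions; this is not the construction of the DTM-filtrations themselves, only the function they rely on.

```python
import math


def dtm(query, points, k):
    """Empirical distance-to-measure of `query` w.r.t. `points`:
    root mean of the k smallest squared Euclidean distances."""
    squared = sorted(
        sum((q - p) ** 2 for q, p in zip(query, point)) for point in points
    )
    return math.sqrt(sum(squared[:k]) / k)


# Toy data (assumption): 20 points on the unit circle plus one outlier at the centre.
circle = [
    (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i in range(20)
]
outlier = (0.0, 0.0)
data = circle + [outlier]

dtm_on_circle = dtm(circle[0], data, k=3)  # small: close neighbours on the circle
dtm_outlier = dtm(outlier, data, k=3)      # large: nearest data points are at distance 1
```

The outlier receives a much larger DTM value than a point of the circle, which is the kind of information a filtration can use to delay the appearance of outliers.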

#### Persistent Homology with Dimensionality Reduction: $k$-Distance vs Gaussian Kernels

Participants : Shreya Arya, Jean-Daniel Boissonnat, Kunal Dutta.

We investigate the effectiveness of dimensionality reduction for computing the persistent homology for both $k$-distance and kernel distance [34]. For $k$-distance, we show that the standard Johnson-Lindenstrauss reduction preserves the $k$-distance, which preserves the persistent homology up to a ${\left(1-\epsilon \right)}^{-1}$ factor with target dimension $O\left(k\log n/{\epsilon }^{2}\right)$. We also prove a concentration inequality for sums of dependent chi-squared random variables, which, under some conditions, allows the persistent homology to be preserved in $O\left(\log n/{\epsilon }^{2}\right)$ dimensions. This answers an open question of Sheehy. For Gaussian kernels, we show that the standard Johnson-Lindenstrauss reduction preserves the persistent homology up to a $4{\left(1-\epsilon \right)}^{-1}$ factor.
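The following sketch illustrates the statement for the $k$-distance on a toy example: points are mapped through a scaled Gaussian random matrix (one standard form of the Johnson-Lindenstrauss reduction) and the $k$-distance is compared before and after projection. The dimensions, the value of `k`, the seed and the helper names are illustrative assumptions and are not taken from [34].

```python
import math
import random


def k_distance(query, points, k):
    """Root mean of the k smallest squared distances from `query` to `points`."""
    squared = sorted(
        sum((a - b) ** 2 for a, b in zip(query, p)) for p in points
    )
    return math.sqrt(sum(squared[:k]) / k)


def gaussian_projection(points, target_dim, rng):
    """Project points with a target_dim x dim matrix of N(0, 1/target_dim) entries."""
    dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(dim)]
        for _ in range(target_dim)
    ]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]


rng = random.Random(0)
# Toy data (assumption): 12 points in dimension 300, reduced to dimension 200.
points = [[rng.uniform(-1.0, 1.0) for _ in range(300)] for _ in range(12)]
projected = gaussian_projection(points, 200, rng)

k = 3
ratios = [
    k_distance(projected[i], projected, k) / k_distance(points[i], points, k)
    for i in range(len(points))
]
```

Each entry of `ratios` measures the distortion of the $k$-distance at one data point; values close to 1 indicate that the reduction preserved it.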

#### Computing Persistent Homology of Flag Complexes via Strong Collapses

Participants : Jean-Daniel Boissonnat, Siddharth Pritam.

In collaboration with Divyansh Pareek (Indian Institute of Technology Bombay, India)

#### Strong Collapse for Persistence

Participants : Jean-Daniel Boissonnat, Siddharth Pritam.

In this paper, we build on the initial success of and show that further decisive progress can be obtained if one restricts the family of simplicial complexes to flag complexes. Flag complexes are fully characterized by their graph (or 1-skeleton), the other faces being obtained by computing the cliques of the graph. Hence, a flag complex can be represented by its graph, which is a very compact representation. Flag complexes are very popular and, in particular, Vietoris-Rips complexes are by far the most widely used simplicial complexes in Topological Data Analysis. It has been shown in that the persistent homology of Vietoris-Rips filtrations can be computed very efficiently using strong collapses. However, most of the time was devoted to computing the maximal cliques of the complex prior to their strong collapse. In this paper [37], we observe that the reduced complex obtained by strongly collapsing a flag complex is itself a flag complex. Moreover, this reduced complex can be computed using only the 1-skeleton (or graph) of the complex, not the set of its maximal cliques. Finally, we show how to compute the equivalent filtration of the sequence of reduced flag simplicial complexes, again using only 1-skeletons. On the theory side, we show that strong collapses of flag complexes can be computed in time $O\left({v}^{2}{k}^{2}\right)$, where $v$ is the number of vertices of the complex and $k$ is the maximal degree of its graph. The algorithm described in this paper has been implemented and the code will soon be released in the Gudhi library. Numerous experiments show that our method outperforms previous methods, e.g. Ripser.
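The idea that a flag complex can be strongly collapsed from its graph alone can be sketched as follows. In a flag complex, a vertex $v$ is dominated by a neighbour $w$ when the closed neighbourhood of $v$ is contained in that of $w$, and dominated vertices can be removed one after another. The code below is a naive illustration of that principle under this assumption; it is not the algorithm of [37] nor the Gudhi implementation, and it makes no attempt to reach the stated complexity bound. The function name and the graph encoding (a dictionary of neighbour sets) are hypothetical.

```python
def strong_collapse_flag(adjacency):
    """Remove dominated vertices until none is left.

    `adjacency` maps each vertex to the set of its neighbours (no self-loops).
    Returns the adjacency of the reduced graph, whose flag complex is the core.
    """
    # Work with closed neighbourhoods N[v] = neighbours of v plus v itself.
    closed = {v: set(ns) | {v} for v, ns in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for v in sorted(closed):
            for w in sorted(closed[v] - {v}):
                if closed[v] <= closed[w]:  # v is dominated by w
                    for u in closed[v]:
                        if u != v:
                            closed[u].discard(v)
                    del closed[v]
                    changed = True
                    break
            if changed:
                break  # restart the scan on the smaller graph
    return {v: ns - {v} for v, ns in closed.items()}


# A path 0 - 1 - 2 is contractible and collapses to a single vertex.
path = {0: {1}, 1: {0, 2}, 2: {1}}
core_of_path = strong_collapse_flag(path)

# A 4-cycle has no dominated vertex, so it is left unchanged.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
core_of_cycle = strong_collapse_flag(cycle)
```

Only neighbourhood inclusions are tested, so no clique is ever enumerated; this is the property that makes the graph-only representation attractive.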

#### Triangulating submanifolds: An elementary and quantified version of Whitney's method

Participants : Jean-Daniel Boissonnat, Siargey Kachanovich, Mathijs Wintraecken.

#### Randomized incremental construction of Delaunay triangulations of nice point sets

Participants : Jean-Daniel Boissonnat, Kunal Dutta, Marc Glisse.

In collaboration with Olivier Devillers (Inria, CNRS, Loria, Université de Lorraine).

Randomized incremental construction (RIC) is one of the most important paradigms for building geometric data structures. Clarkson and Shor developed a general theory that led to numerous algorithms that are both simple and efficient in theory and in practice.

Randomized incremental constructions are usually space- and time-optimal in the worst case, as exemplified by the construction of convex hulls, Delaunay triangulations and arrangements of line segments.

However, the worst-case scenario occurs rarely in practice and we would like to understand how RIC behaves when the input is nice in the sense that the associated output is significantly smaller than in the worst case. For example, it is known that the Delaunay triangulations of nicely distributed points in ${ℝ}^{d}$ or on polyhedral surfaces in ${ℝ}^{3}$ have linear complexity, as opposed to a worst-case complexity of $\Theta \left({n}^{⌊d/2⌋}\right)$ in the first case and quadratic in the second. The standard analysis does not provide accurate bounds on the complexity of such cases and we aim at establishing such bounds in this paper [35]. More precisely, we show that, in the two cases above and variants of them, the complexity of the usual RIC is $O\left(n\log n\right)$, which is optimal. In other words, without any modification, RIC nicely adapts to good cases of practical value.

Along the way, we prove a probabilistic lemma for sampling without replacement, which may be of independent interest.

#### Approximate Polytope Membership Queries

Participant : Guilherme Da Fonseca.

In collaboration with Sunil Arya (Hong Kong University of Science and Technology) and David Mount (University of Maryland).

#### Approximate Convex Intersection Detection with Applications to Width and Minkowski Sums

Participant : Guilherme Da Fonseca.

In collaboration with Sunil Arya (Hong Kong University of Science and Technology) and David Mount (University of Maryland).

#### Approximating the Spectrum of a Graph

Participant : David Cohen-Steiner.

In collaboration with Weihao Kong (Stanford University), Christian Sohler (TU Dortmund) and Gregory Valiant (Stanford University).

#### Spectral Properties of Radial Kernels and Clustering in High Dimensions

Participants : David Cohen-Steiner, Alba Chiara de Vitis.

In this paper [40], we study the spectrum and the eigenvectors of radial kernels for mixtures of distributions in ${ℝ}^{n}$. Our approach focuses on high dimensions and relies solely on the concentration properties of the components in the mixture. We give several results describing the structure of kernel matrices for a sample drawn from such a mixture. Based on these results, we analyze the ability of kernel PCA to cluster high-dimensional mixtures. In particular, we exhibit a specific kernel leading to a simple spectral algorithm for clustering mixtures with possibly common means but different covariance matrices. This algorithm will succeed if the angle between any two covariance matrices in the mixture (seen as vectors in ${ℝ}^{{n}^{2}}$) is larger than $\Omega \left({n}^{-1/6}{\log }^{5/3}n\right)$. In particular, the required angular separation tends to 0 as the dimension tends to infinity. To the best of our knowledge, this is the first polynomial time algorithm for clustering such mixtures beyond the Gaussian case.

#### Exact computation of the matching distance on 2-parameter persistence modules

Participant : Steve Oudot.

In collaboration with Michael Kerber (T.U. Graz) and Michael Lesnick (SUNY).

The matching distance is a pseudometric on multi-parameter persistence modules, defined in terms of the weighted bottleneck distance on the restriction of the modules to affine lines. It is known that this distance is stable in a reasonable sense, and can be efficiently approximated, which makes it a promising tool for practical applications. In [44] we show that in the 2-parameter setting, the matching distance can be computed exactly in polynomial time. Our approach subdivides the space of affine lines into regions, via a line arrangement. In each region, the matching distance restricts to a simple analytic function, whose maximum is easily computed. As a byproduct, our analysis establishes that the matching distance is a rational number, if the bigrades of the input modules are rational.
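For reference, the definition recalled at the start of this paragraph can be written schematically as follows. The symbols are notational assumptions made here: $M$ and $N$ are the two modules, $L$ ranges over affine lines of positive slope, ${M|}_{L}$ is the restriction of $M$ to $L$, ${d}_{B}$ is the bottleneck distance, and $w\left(L\right)$ is the slope-dependent weight, whose precise expression is given in [44].

$$
{d}_{\mathrm{match}}\left(M,N\right)=\underset{L}{\sup }\; w\left(L\right)\,{d}_{B}\left({M|}_{L},{N|}_{L}\right)
$$

The line arrangement mentioned above partitions the set of lines $L$ into regions on which the quantity inside the supremum has a simple closed form.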

#### A Comparison Framework for Interleaved Persistence Modules

Participant : Miroslav Kramár.

In collaboration with Rachel Levanger (UPenn), Shaun Harker and Konstantin Mischaikow (Rutgers).

In [43], we present a generalization of the induced matching theorem of [1] and use it to prove a generalization of the algebraic stability theorem for $ℝ$-indexed pointwise finite-dimensional persistence modules. Via numerous examples, we show how the generalized algebraic stability theorem enables the computation of rigorous error bounds in the space of persistence diagrams that go beyond the typical formulation in terms of bottleneck (or log bottleneck) distance.

#### Discrete Morse Theory for Computing Zigzag Persistence

Participant : Clément Maria.

In collaboration with Hannah Schreiber (Graz University of Technology, Austria)