## Section: New Results

### Theory

Participants: Anne Auger, Nikolaus Hansen.

The paper “Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles” has been published in the Journal of Machine Learning Research (JMLR) [3]. In this paper, written in collaboration with Yann Ollivier in particular, we lay the foundations of stochastic optimization by means of information geometry. We provide a unified framework for stochastic optimization on arbitrary search spaces that recovers well-known algorithms on continuous and discrete search spaces and places them under the common umbrella of Information-Geometric Optimization.
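As an illustration of how such a framework specializes on a discrete search space, the following minimal sketch applies a natural-gradient-style update to a product of Bernoulli distributions on bit strings, which yields a PBIL-like rule. The function name `igo_bernoulli`, the truncation weights, the step size `dt` and the clipping of `theta` are assumptions made for this example and are not taken from [3].

```python
import numpy as np

def igo_bernoulli(f, n, lam=20, mu=5, dt=0.1, steps=200, seed=0):
    """Sketch of an IGO-style update for Bernoulli distributions (maximization).

    theta[j] is the probability that bit j equals 1. All parameter values
    are illustrative assumptions, not recommendations from the paper.
    """
    rng = np.random.default_rng(seed)
    theta = np.full(n, 0.5)
    for _ in range(steps):
        # Sample lam bit strings from the current distribution.
        x = (rng.random((lam, n)) < theta).astype(float)
        # Rank samples, best first; only the ranking of f-values is used.
        order = np.argsort([-f(xi) for xi in x])
        # Truncation weights: the best mu samples share the total weight 1.
        w = np.zeros(lam)
        w[order[:mu]] = 1.0 / mu
        # Move theta towards the weighted mean of the selected samples.
        theta = theta + dt * (w @ (x - theta))
        # Assumption: keep probabilities away from 0 and 1 to avoid fixation.
        theta = np.clip(theta, 1.0 / n, 1.0 - 1.0 / n)
    return theta

# Usage: maximize the number of ones in a bit string of length 10.
theta = igo_bernoulli(np.sum, n=10)
```

Because the update depends on the objective only through the ranking of the samples, it is unchanged under strictly increasing transformations of `f`.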

When analyzing the stability of Markov chains stemming from comparison-based stochastic algorithms, we face difficulties because the Markov chains have the form ${\Phi}_{t+1}=F({\Phi}_{t},{U}_{t+1})$, where $\{{U}_{t}:t\ge 0\}$ are i.i.d. random vectors and $F$ is a discontinuous function. The discontinuity comes from the comparison-based nature of the algorithms. If $F$ were ${C}^{\infty}$ or ${C}^{1}$, we could easily prove stability properties such as irreducibility, and show that compact sets are small sets, by investigating the underlying control model and showing that it has globally attracting states where controllability conditions hold, using results developed by Sean Meyn and co-authors.

In [2], we found that the results of Meyn and co-authors can be generalized to a great extent to the case where ${\Phi}_{t+1}=F({\Phi}_{t},\alpha ({\Phi}_{t},{U}_{t+1}))$, where $F$ is ${C}^{1}$ and $\alpha $ is discontinuous but such that $\alpha (x,U)$ admits a lower semi-continuous density. We have proposed verifiable conditions for irreducibility and aperiodicity, and shown that compact sets are small sets.
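To make the decomposition ${\Phi}_{t+1}=F({\Phi}_{t},\alpha ({\Phi}_{t},{U}_{t+1}))$ concrete, the sketch below writes a toy comparison-based algorithm in this form. It is a hypothetical example built for illustration, not the algorithm analyzed in [2]: the state `z` plays the role of the search point normalized by the step size, `alpha` performs the selection among the candidate steps (discontinuous in `u`), and `F` is a smooth update. The step-size rule inside `F` and the constant `c` are assumptions.

```python
import numpy as np

def alpha(z, u):
    """Selection step: return the row of u whose candidate z + u[i] is closest to 0.

    The argmin makes this map discontinuous in u, which is where the
    comparison-based nature of the algorithm enters.
    """
    best = np.argmin(np.linalg.norm(z + u, axis=1))
    return u[best]

def F(z, w, c=0.5):
    """Smooth update: move by the selected step w, then renormalize.

    eta is a hypothetical step-size change that depends smoothly on the
    squared length of w, so F is continuously differentiable in (z, w).
    """
    n = z.size
    eta = np.exp(c * (w @ w / n - 1.0))
    return (z + w) / eta

def simulate(n=5, lam=10, steps=1000, seed=0):
    """Iterate z_{t+1} = F(z_t, alpha(z_t, U_{t+1})) with i.i.d. Gaussian U_t."""
    rng = np.random.default_rng(seed)
    z = np.ones(n)
    norms = np.empty(steps)
    for t in range(steps):
        u = rng.standard_normal((lam, n))  # lam candidate steps in dimension n
        z = F(z, alpha(z, u))
        norms[t] = np.linalg.norm(z)
    return norms
```

Plotting the output of `simulate()` against `t` gives a quick empirical impression of whether the chain stays in a bounded region, which is the kind of behavior the stability analysis aims to establish rigorously.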

The development of evolution strategies has been greatly driven by so-called progress rate or quality gain analysis, where simplifying assumptions are made to obtain quantitative estimates of the progress in one step and to deduce from them how to set parameters such as recombination weights, learning rates, and so on.

This theory, while very useful, often relied on approximations that were not always well appraised, justified or clearly stated. In the past, we have rigorously derived different progress rate results and related them to bounds on convergence rates. We have now rigorously investigated the quality gain (that is, the progress measured in terms of the objective function) on general convex-quadratic functions using weighted recombination. This allowed us to derive the dependency of the convergence rate of evolution strategies on the eigenspectrum of the Hessian matrix of the convex-quadratic function, and to give hints on how to set the learning rate [4], [9].
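The quantity studied can be estimated numerically. The following minimal sketch computes a Monte Carlo estimate of a one-step quality gain for weighted recombination on $f(x)=\frac{1}{2}{x}^{T}Hx$. The normalization by $f(m)$, the function name `quality_gain` and all parameter values are assumptions made for this example; the precise definitions used in [4] and [9] may differ.

```python
import numpy as np

def quality_gain(H, m, sigma, weights, lam, trials=2000, seed=0):
    """Monte Carlo estimate of E[(f(m) - f(m_new)) / f(m)] for one step.

    f(x) = 0.5 * x^T H x, and m_new = m + sigma * sum_i weights[i] * z_(i),
    where z_(i) is the i-th best of lam Gaussian samples.
    """
    rng = np.random.default_rng(seed)
    n = m.size
    mu = len(weights)

    def f(x):
        # Works for a single point (shape n) or a batch (shape lam x n).
        return 0.5 * np.einsum("...i,ij,...j->...", x, H, x)

    fm = f(m)
    gains = np.empty(trials)
    for k in range(trials):
        z = rng.standard_normal((lam, n))
        order = np.argsort(f(m + sigma * z))   # best (smallest f) first
        step = weights @ z[order[:mu]]         # weighted recombination
        gains[k] = (fm - f(m + sigma * step)) / fm
    return gains.mean()

# Usage: an ill-conditioned diagonal Hessian with a small step size.
H = np.diag([1.0, 10.0, 100.0])
g = quality_gain(H, m=np.ones(3), sigma=0.01,
                 weights=np.array([0.5, 0.3, 0.2]), lam=6)
```

Repeating the estimate for different `sigma` values or different eigenvalues of `H` gives an empirical view of how the one-step progress depends on the eigenspectrum.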