## Section: New Results

### Learning theory

#### Data-driven calibration of linear estimators with minimal penalties (S. Arlot and F. Bach)

We tackle the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression or spline smoothing, and the choice of a kernel in multiple kernel learning. We propose a new algorithm which first consistently estimates the noise variance, based upon the concept of minimal penalty previously introduced in the context of model selection. Plugging this variance estimate into Mallows' C_L penalty is then proved to yield an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often improves significantly on existing calibration procedures such as 10-fold cross-validation or generalized cross-validation [21].
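The two-step scheme can be sketched on synthetic data: scan a penalty level C, locate the jump in the selected effective dimension tr(A) (the minimal-penalty heuristic), then plug the resulting variance estimate into Mallows' C_L. This is a toy illustration, not the paper's exact algorithm: the scanned penalty C·tr(A) is a simplification of the paper's minimal-penalty shape, and the kernel, bandwidth, and grids are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200, 0.3
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + sigma * rng.standard_normal(n)

# Family of kernel ridge smoothers A_lam (Gaussian kernel, bandwidth 0.1).
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * 0.1**2))
lams = np.logspace(-8, 0, 60)
hats = [K @ np.linalg.solve(K + n * lam * np.eye(n), np.eye(n)) for lam in lams]
dfs = np.array([np.trace(A) for A in hats])            # effective dimension tr(A_lam)
rss = np.array([np.sum((y - A @ y) ** 2) for A in hats])

# Step 1 (minimal-penalty heuristic): for each penalty level C, minimize
# rss + C * df over the family.  The selected effective dimension drops
# sharply around C ~ sigma^2; the largest jump gives the variance estimate.
Cs = np.linspace(1e-4, 0.5, 400)
sel_df = np.array([dfs[np.argmin(rss + C * dfs)] for C in Cs])
jump = int(np.argmax(sel_df[:-1] - sel_df[1:]))
sigma2_hat = Cs[jump + 1]

# Step 2: plug the estimate into Mallows' C_L and select lambda.
best = int(np.argmin(rss + 2 * sigma2_hat * dfs))
```

The point of the construction is that Step 1 needs no knowledge of the noise level, so the whole calibration is data-driven.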

#### Resampling-based estimation of the accuracy of satellite ephemerides (S. Arlot, joint work with J. Desmars, J.-E. Arlot, V. Lainey and A. Vienne)

The accuracy of predicted orbital positions depends on the quality of the theoretical model and of the observations used to fit it. During the observation period, this accuracy can be estimated by comparison with the observations; outside this period, the estimation remains difficult. Many methods have been developed for asteroid ephemerides in order to evaluate this accuracy. In [5] we introduced a new method for estimating the accuracy of predicted positions at any time, in particular outside the observation period. The method is based on bootstrap resampling and requires only minimal assumptions. It was applied to two of the main Saturnian satellites, Mimas and Titan, and compared with methods previously used for asteroids; bootstrap resampling proves to be a robust and practical way of estimating the accuracy of predicted positions.
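The idea can be illustrated on a toy orbit model (a hypothetical linear-drift-plus-periodic signal, not a real satellite theory): bootstrap the post-fit residuals, refit the model on each resampled data set, and take the spread of the extrapolated positions as the accuracy estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for an ephemeris model: linear drift plus one periodic term
# (illustrative assumption, with a known frequency of 0.8).
def design(t):
    return np.column_stack([np.ones_like(t), t, np.sin(0.8 * t), np.cos(0.8 * t)])

t_obs = np.linspace(0.0, 10.0, 80)                  # observation period
true_pos = 0.5 * t_obs + np.sin(0.8 * t_obs)
obs = true_pos + 0.05 * rng.standard_normal(t_obs.size)

# Fit the model and collect the residuals.
X = design(t_obs)
theta, *_ = np.linalg.lstsq(X, obs, rcond=None)
resid = obs - X @ theta

# Bootstrap the residuals, refit, and predict outside the observation period.
t_pred = np.array([15.0])
preds = []
for _ in range(500):
    y_star = X @ theta + rng.choice(resid, size=resid.size, replace=True)
    th_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    preds.append(float(design(t_pred) @ th_star))
preds = np.array(preds)
accuracy = float(preds.std())   # spread of extrapolated positions = accuracy estimate
```

Because the resampling only uses the observed residuals, the estimate is available at any prediction time, inside or outside the fitting window, which is exactly the regime where direct comparison with observations is impossible.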

#### Asymptotically optimal regularization in smooth parametric models (F. Bach, with P. Liang, G. Bouchard, M. I. Jordan)

Many types of regularization schemes have been employed in statistical learning, each one motivated by some assumption about the problem domain. In [36], we present a unified asymptotic analysis of smooth regularizers, which allows us to see how the validity of these assumptions impacts the success of a particular regularizer. In addition, our analysis motivates an algorithm for optimizing regularization parameters, which in turn can be analyzed within our framework. We apply our analysis to several examples, including hybrid generative-discriminative learning and multi-task learning.

#### Minimax policies for adversarial and stochastic bandits (J.-Y. Audibert)

In the multi-armed bandit problem, at each stage an agent (or decision maker) chooses one action (or arm) and receives a reward for it. The agent aims at maximizing its cumulative reward. Since it does not know the process generating the rewards, it must explore (try) the different actions while still exploiting (concentrating its draws on) the seemingly most rewarding arms. In [22], we close a long-standing gap in the characterization of the minimax rate for the multi-armed bandit problem. Concretely, we remove an extraneous logarithmic factor from the previously known upper bound and propose a new family of randomized algorithms based on an implicit normalization, together with a new analysis. We also consider the stochastic case and prove that an appropriate modification of the upper confidence bound policy UCB1 (Auer et al., 2002) achieves the distribution-free optimal rate while retaining a distribution-dependent rate logarithmic in the number of plays.
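The modified upper-confidence-bound policy can be sketched as follows: each arm's index is its empirical mean plus the exploration bonus sqrt(max(log(n / (K T_i)), 0) / T_i), where T_i is the number of times arm i has been played. The Bernoulli arms and their means below are illustrative choices, not from [22].

```python
import numpy as np

def modified_ucb(means, n_rounds, rng):
    """Index policy with bonus sqrt(max(log(n / (K * T_i)), 0) / T_i),
    run on Bernoulli arms; returns the cumulative pseudo-regret."""
    K = len(means)
    counts = np.zeros(K)     # T_i: number of plays of each arm
    sums = np.zeros(K)       # cumulative reward of each arm
    best = max(means)
    regret = 0.0
    for t in range(n_rounds):
        if t < K:
            arm = t          # initialization: play each arm once
        else:
            bonus = np.sqrt(np.maximum(np.log(n_rounds / (K * counts)), 0.0) / counts)
            arm = int(np.argmax(sums / counts + bonus))
        reward = float(rng.random() < means[arm])   # Bernoulli draw
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret

rng = np.random.default_rng(0)
r = modified_ucb([0.5, 0.6], 5000, rng)
```

Unlike UCB1's log(t) bonus, the bonus here vanishes once an arm has been played about n/K times, which is what removes the extra logarithmic factor in the distribution-free regime.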

#### Linear least squares regression (J.-Y. Audibert)

In [49], we consider the problem of predicting as well as the best linear combination of d given functions in least squares regression, together with variants of this problem that put constraints on the parameters of the linear combination. When the input distribution is known, an algorithm already exists with expected excess risk of order d/n, where n is the size of the training data. Without this strong assumption, standard results often contain a multiplicative log n factor and require additional assumptions such as uniform boundedness of the d-dimensional input representation and exponential moments of the output. This work provides new risk bounds for the ridge estimator and the ordinary least squares estimator, and their variants. It also provides shrinkage procedures with convergence rate d/n (i.e., without the logarithmic factor), in expectation and in deviation, under various assumptions. The surprising common feature of these results is that exponential deviations are obtained without any exponential moment condition on the output distribution. All risk bounds are derived through a PAC-Bayesian analysis of truncated differences of losses. Finally, we show that some of these results are not specific to the least squares loss, but can be generalized to similar strongly convex loss functions.
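The d/n rate can be checked numerically in the simplest setting, a well-specified Gaussian fixed-design model, where the expected excess risk of ordinary least squares is exactly sigma^2 d / n. This is only a sanity-check illustration; the paper's random-design and heavy-tailed results are much stronger.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 500, 10, 1.0
X = rng.standard_normal((n, d))       # fixed design, kept the same across replicates
theta = rng.standard_normal(d)

# Monte Carlo estimate of the fixed-design excess risk of OLS:
# E[ ||X (theta_hat - theta)||^2 / n ] = sigma^2 * d / n.
reps = 400
excess = []
for _ in range(reps):
    y = X @ theta + sigma * rng.standard_normal(n)
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    excess.append(np.sum((X @ (theta_hat - theta)) ** 2) / n)
mean_excess = float(np.mean(excess))  # should be close to d / n = 0.02
```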

#### Change-point detection (S. Arlot, joint work with A. Celisse)

We tackle the problem of detecting abrupt changes in the mean of a heteroscedastic signal by model selection, without prior knowledge of how the noise level varies. We propose a new family of change-point detection procedures, showing that cross-validation methods can succeed in the heteroscedastic framework, whereas most existing procedures are not robust to heteroscedasticity. The robustness to heteroscedasticity of the proposed procedures is supported by an extensive simulation study, together with recent theoretical results. An application to Comparative Genomic Hybridization (CGH) data shows that robustness to heteroscedasticity can indeed be required for the analysis of such data [48].
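A minimal sketch of the approach, assuming piecewise-constant means: segment half of the samples by least-squares dynamic programming, then choose the number of segments by cross-validating on the held-out half. The signal, noise levels, and 2-fold even/odd split below are illustrative choices, not the procedures of [48].

```python
import numpy as np

rng = np.random.default_rng(0)
# Heteroscedastic signal: two jumps in the mean, with a noisier middle segment.
mu = np.repeat([0.0, 2.0, 0.5], 100)
sd = np.repeat([0.3, 0.8, 0.3], 100)
y = mu + sd * rng.standard_normal(300)

def best_segmentations(z, kmax):
    """For k = 1..kmax, the least-squares optimal segmentation of z into k
    constant segments, by dynamic programming; returns one breakpoint list per k."""
    n = len(z)
    cs = np.concatenate([[0.0], np.cumsum(z)])
    cs2 = np.concatenate([[0.0], np.cumsum(z ** 2)])
    def sse(i, j):                       # SSE of a constant fit on z[i:j]
        s = cs[j] - cs[i]
        return cs2[j] - cs2[i] - s * s / (j - i)
    cost = np.full((kmax + 1, n + 1), np.inf)
    arg = np.zeros((kmax + 1, n + 1), dtype=int)
    cost[0][0] = 0.0
    for k in range(1, kmax + 1):
        for j in range(k, n + 1):
            vals = [cost[k - 1][i] + sse(i, j) for i in range(k - 1, j)]
            i0 = int(np.argmin(vals))
            cost[k][j] = vals[i0]
            arg[k][j] = i0 + (k - 1)
    segs = []
    for k in range(1, kmax + 1):         # backtrack the breakpoints
        bps, j = [], n
        for kk in range(k, 0, -1):
            i = int(arg[kk][j])
            if kk > 1:
                bps.append(i)
            j = i
        segs.append(sorted(bps))
    return segs

# 2-fold cross-validation: segment the even-indexed half, score on the odd half.
train, test = y[::2], y[1::2]
segs = best_segmentations(train, 6)
errs = []
for bps in segs:
    bounds = [0] + bps + [len(train)]
    pred = np.concatenate([np.full(b - a, train[a:b].mean())
                           for a, b in zip(bounds[:-1], bounds[1:])])
    errs.append(float(np.sum((test - pred) ** 2)))
k_hat = 1 + int(np.argmin(errs))
```

The held-out score depends only on prediction error, not on an estimate of the noise level, which is why this kind of criterion remains valid when the variance changes along the signal.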