Team mistis

Section: New Results

Semi and non-parametric methods

Modelling extremal events

Participants : Stéphane Girard, Laurent Gardes.

Joint work with: Guillou, A. (Univ. Strasbourg)

We introduced a new model of tail distributions depending on two parameters $\tau \in [0, 1]$ and $\theta > 0$ [55]. This model covers very different tail behaviors from the Fréchet and Gumbel maximum domains of attraction. In the particular cases of Pareto-type tails ($\tau = 1$) or Weibull tails ($\tau = 0$), our estimators coincide with classical ones proposed in the literature, which allows their asymptotic normality to be recovered in a unified way.
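As a point of reference for the Pareto-type case ($\tau = 1$), the sketch below implements the Hill estimator, a classical tail-index estimator for Pareto-type tails. It is not the estimator of [55]; taking Hill as the classical benchmark is an assumption made here, and the names `hill_estimator` and `pareto_sample`, as well as the choice of `k`, are purely illustrative.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the tail index from the k largest observations."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)  # X_(1) >= X_(2) >= ...
    threshold = ordered[k]                  # (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    # Average log-excess of the k largest points over the threshold.
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


def pareto_sample(n, gamma, seed=0):
    """Draw n points with survival function x**(-1/gamma) for x >= 1."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so every draw is >= 1.
    return [(1.0 - rng.random()) ** (-gamma) for _ in range(n)]
```

On simulated Pareto data, `hill_estimator(pareto_sample(20000, 0.5), 2000)` should land close to the true tail index 0.5; in practice the result is sensitive to the number `k` of upper order statistics retained.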

Conditional extremal events

Participants : Stéphane Girard, Laurent Gardes, Alexandre Lekina, Eugen Ursu.

Joint work with: Amblard, C. (TimB in TIMC laboratory, Univ. Grenoble 1).

The goal of the PhD thesis of Alexandre Lekina is to contribute to the development of theoretical and algorithmic models to tackle conditional extreme value analysis, i.e. the situation where some covariate information X is recorded simultaneously with a quantity of interest Y. In such a case, the tail heaviness of Y depends on X, and thus the tail index as well as the extreme quantiles are also functions of the covariate. We combine nonparametric smoothing techniques [51] with extreme-value methods in order to obtain efficient estimators of the conditional tail index [9] and conditional extreme quantiles [56]. Conditional extremes are studied in climatology, where one is interested in how climate change over the years might affect extreme temperatures or rainfall. In this case, the covariate is univariate (time). Bivariate examples include the study of extreme rainfall as a function of the geographical location. The application part of the study is joint work with the LTHE (Laboratoire d'étude des Transferts en Hydrologie et Environnement) located in Grenoble. The results obtained have been submitted for publication [54].
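To make the idea of a covariate-dependent tail index concrete, here is a minimal sketch of one simple way to combine smoothing with an extreme-value estimator: keep only the observations whose covariate falls in a window around the point of interest, then apply the Hill estimator to the corresponding responses. This is an illustration under stated assumptions, not the estimators of [9] or [56]; the names `local_hill` and `simulate`, the window half-width `h`, and the toy data model are all hypothetical.

```python
import math
import random


def local_hill(xs, ys, x0, h, k):
    """Hill estimate of the tail index of Y using points with |x - x0| <= h."""
    local = sorted((y for x, y in zip(xs, ys) if abs(x - x0) <= h),
                   reverse=True)
    if not 1 <= k < len(local):
        raise ValueError("need 1 <= k < number of points in the window")
    threshold = local[k]  # (k+1)-th largest response in the window
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    return sum(math.log(y / threshold) for y in local[:k]) / k


def simulate(n, seed=0):
    """Toy data: X uniform on [0, 1); Y | X = x Pareto with index 0.2 + 0.6 x."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [(1.0 - rng.random()) ** (-(0.2 + 0.6 * x)) for x in xs]
    return xs, ys
```

With data from `simulate`, the estimate at `x0 = 0.9` should come out larger than at `x0 = 0.1`, reflecting the heavier conditional tail. The window half-width `h` trades bias (covariate heterogeneity inside the window) against variance (fewer points).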

Future work will include the study of multivariate and spatial extreme values. To this end, research on some particular copulas [1][11] has been initiated with Cécile Amblard, since copulas are the key tool for building multivariate distributions [61].

Level sets estimation

Participants : Stéphane Girard, Laurent Gardes.

Joint work with: Daouia, A. (Univ. Toulouse I) and Jacob, P. (Univ. Montpellier II).

The boundary of a set of points is viewed as the largest level set of the distribution of the points. Estimating it is then an extreme quantile curve estimation problem. We propose estimators based on projection as well as on kernel regression methods applied to the set of extreme values, for particular sets of points. Our current work is to define similar methods based on wavelet expansions in order to estimate non-smooth boundaries, and on local polynomial estimators [17] to get rid of boundary effects. We are also working on the extension of our results to more general sets of points. To this end, we focus on the family of conditional heavy tails. An estimator of the conditional tail index has been proposed [9] and the corresponding conditional extreme quantile estimator has been derived [56] in a fixed design setting. The extension to the random design framework is investigated in [49]. This work was initiated in the PhD work of Laurent Gardes [53], co-directed by Pierre Jacob and Stéphane Girard.

Dimension reduction

Participants : Stéphane Girard, Laurent Gardes, Mathieu Fauvel.

To overcome the curse of dimensionality arising in high-dimensional regression problems, one approach is to reduce the dimension of the problem. Sliced Inverse Regression (SIR) is an interesting solution for this purpose. The original method, however, requires the inversion of the predictors' covariance matrix. In case of collinearity between these predictors, or of small sample sizes compared to the dimension, the inversion is not possible and a regularization technique has to be used. We thus develop a new approach [13] based on a Fisher Lecture given by R.D. Cook, where it is shown that SIR axes can be interpreted as solutions of an inverse regression problem. In this paper, a Gaussian prior distribution is introduced on the unknown parameters of the inverse regression problem in order to regularize their estimation. We show that some existing SIR regularizations can enter our framework, which permits a global understanding of these methods. Three new priors are proposed, leading to new regularizations of the SIR method. Results are compared with the Support Vector Machine (SVM) approach on hyperspectral data [12].
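The role of the covariance inversion in SIR can be seen in a short sketch. The code below implements classical SIR with an optional ridge term added to the covariance matrix before inversion. The ridge term is an assumption chosen here as a simple, generic regularization; it is not the Gaussian-prior regularization of [13]. The function name `sir_directions` and its parameters are illustrative, and NumPy is assumed to be available.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_dirs=1, ridge=0.0):
    """Sliced Inverse Regression with an optional ridge term (assumption)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / n  # predictor covariance matrix
    # Slice the sample along the sorted response.
    slices = np.array_split(np.argsort(y), n_slices)
    # Between-slice covariance of the slice means of X.
    gamma = np.zeros((p, p))
    for idx in slices:
        m = Xc[idx].mean(axis=0)
        gamma += (len(idx) / n) * np.outer(m, m)
    # Eigenvectors of (sigma + ridge * I)^{-1} gamma give the SIR axes.
    M = np.linalg.solve(sigma + ridge * np.eye(p), gamma)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(eigvals.real)[::-1][:n_dirs]
    B = eigvecs[:, order].real
    return B / np.linalg.norm(B, axis=0)  # unit-norm columns
```

With `ridge=0.0` the linear solve fails or becomes unstable when the covariance matrix is singular or ill-conditioned, which is exactly the situation (collinearity, small sample size relative to the dimension) that motivates regularization.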

Nuclear plants reliability

Participants : Laurent Gardes, Stéphane Girard.

Joint work with: Perot, N., Devictor, N. and Marquès, M. (CEA).

One of the main activities of the LCFR (Laboratoire de Conduite et Fiabilité des Réacteurs), CEA Cadarache, concerns the probabilistic analysis of some processes using reliability and statistical methods. In this context, probabilistic modelling of steel tenacity in nuclear plant tanks has been developed. The databases under consideration include hundreds of data points indexed by temperature, so that reliable probabilistic models have been obtained for the central part of the distribution. However, in this reliability problem, the key point is to investigate the behavior of the model in the distribution tail. In particular, we are mainly interested in studying the lowest tenacities when the temperature varies (Figure 7).

Figure 7. Tenacity as a function of the temperature.

This work is supported by a research contract (from December 2008 to December 2010) involving mistis and the LCFR.

Quantifying uncertainties on extreme rainfall estimations

Participants : Eugen Ursu, Laurent Gardes, Stéphane Girard.

Joint work with: Molinié, G. from Laboratoire d'Etude des Transferts en Hydrologie et Environnement (LTHE), France.

Extreme rainfall is generally associated with two different precipitation regimes. Extreme cumulated rainfall over 24 hours results from stratiform clouds on which the relief forcing is of primary importance. Extreme rainfall rates are defined as rainfall rates with low probability of occurrence, typically with higher mean return-levels than the maximum observed level. For example, Figure 8 presents the return levels for the Cévennes-Vivarais region. It is then of primary importance to study the sensitivity of the extreme rainfall estimation to the estimation method considered. Preliminary work on this topic is available in [54]. mistis obtained a Ministry grant for a related ANR project (see Section 8.2).

Figure 8. Map of the mean return-levels (in mm) for a period of 10 years.

Retrieval of Mars surface physical properties from OMEGA hyperspectral images

Participants : Mathieu Fauvel, Laurent Gardes, Stéphane Girard.

Joint work with: Douté, S. from Laboratoire de Planétologie de Grenoble, France in the context of the VAHINE project (see Section 8.2 ).

Visible and near infrared imaging spectroscopy is one of the key techniques to detect, map and characterize mineral and volatile (e.g. water-ice) species existing at the surface of the planets. Indeed, the chemical composition, granularity, texture, physical state, etc. of the materials determine the existence and morphology of the absorption bands. The resulting spectra therefore contain very useful information. Current imaging spectrometers provide data organized as three-dimensional hyperspectral images: two spatial dimensions and one spectral dimension. Our goal is to estimate the functional relationship $F$ between some observed spectra and some physical parameters. To this end, a database of synthetic spectra is generated by a physical radiative transfer model and used to estimate $F$. The high dimension of spectra is reduced by Gaussian regularized sliced inverse regression (GRSIR) to overcome the curse of dimensionality and consequently the sensitivity of the inversion to noise (ill-conditioned problems). This method is compared with the more classical SVM approach. GRSIR has the advantage of being very fast, interpretable and accurate [12]. Recall that SVM approximates the functional $F$: $y = F(x)$ using a solution of the form $F(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i) + b$, where the $x_i$ are samples from the training set, $K$ is a kernel function and $\left((\alpha_i)_{i=1}^{n}, b\right)$ are the parameters of $F$, which are estimated during the training process. The kernel $K$ is used to produce a non-linear function. The SVM training entails minimization of $\left[\frac{1}{n}\sum_{i=1}^{n} l\left(F(x_i), y_i\right) + \lambda \|F\|^2\right]$ with respect to $\left((\alpha_i)_{i=1}^{n}, b\right)$, with $l\left(F(x), y\right) = 0$ if $|F(x) - y| \le \epsilon$ and $l\left(F(x), y\right) = |F(x) - y| - \epsilon$ otherwise.
Prior to running the algorithm, the following parameters need to be tuned: $\epsilon$, which controls the resolution of the estimation, $\lambda$, which controls the smoothness of the solution, and the kernel parameters ($\gamma$ for the Gaussian kernel).
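The SVM quantities recalled above can be written out directly. The sketch below evaluates the kernel expansion of F, the epsilon-insensitive loss, and the regularized training objective for given parameters; it does not perform the minimization itself. Two points are assumptions not stated in the text: the Gaussian kernel is parametrized as exp(-gamma * ||x - z||^2), and the squared norm of F is taken to be the usual kernel-space norm alpha^T K alpha. All function names are illustrative.

```python
import math


def gaussian_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2) (assumed parametrization)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def predict(x, support, alphas, b, gamma):
    """F(x) = sum_i alpha_i K(x, x_i) + b."""
    return sum(a * gaussian_kernel(x, xi, gamma)
               for a, xi in zip(alphas, support)) + b


def eps_insensitive(pred, y, eps):
    """l(F(x), y): 0 inside the eps-tube, |F(x) - y| - eps outside."""
    return max(0.0, abs(pred - y) - eps)


def objective(support, ys, alphas, b, gamma, eps, lam):
    """(1/n) sum_i l(F(x_i), y_i) + lam * ||F||^2."""
    n = len(support)
    loss = sum(eps_insensitive(predict(x, support, alphas, b, gamma), y, eps)
               for x, y in zip(support, ys)) / n
    # Assumption: ||F||^2 is the kernel-space norm alpha^T K alpha.
    norm2 = sum(ai * aj * gaussian_kernel(xi, xj, gamma)
                for ai, xi in zip(alphas, support)
                for aj, xj in zip(alphas, support))
    return loss + lam * norm2
```

Evaluating `objective` over a grid of `eps`, `lam` and `gamma` values on held-out data is one common way to tune these three parameters before training.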

Statistical analysis of hyperspectral multi-angular data from Mars

Participants : Mathieu Fauvel, Florence Forbes, Laurent Gardes, Stéphane Girard.

Joint work with: Douté, S. from Laboratoire de Planétologie de Grenoble, France in the context of the VAHINE project (see Section 8.2 ).

A new generation of imaging spectrometers is emerging with an angular dimension in addition to the three usual dimensions (two spatial and one spectral). The surface of the planets will now be observed from different viewpoints on the satellite trajectory, corresponding to about ten different angles, instead of only one, usually the vertical (0 degree angle) viewpoint. Multi-angle imaging spectrometers present several advantages: the influence of the atmosphere on the signal can be better identified and separated from the surface signal of interest, and the shape and size of the surface components and the surface granularity can be better characterized. However, this new generation of spectrometers also results in a significant increase in the size (several terabits expected) and complexity of the generated data. To investigate the use of statistical techniques to deal with these generic sources of complexity, we made preliminary experiments using our HDDC technique on a first set of realistic synthetic 4D spectral data provided by our collaborators from LPG. This data set turned out not to be relevant for our study, because the simulated angular information provided was not discriminant and did not allow us to draw useful conclusions. Further experiments on other data sets are therefore necessary.

