Team dionysos

Section: New Results

Dependability and extensions

Participants: Raymond Marie, Gerardo Rubino, Samira Saggadi, Bruno Sericola, Bruno Tuffin.

We maintain a research activity in different areas related to dependability, performability and vulnerability analysis of communication systems. In 2010 our focus has been on evaluation techniques using both the Monte Carlo and the Quasi-Monte Carlo approaches. Monte Carlo (and Quasi-Monte Carlo) methods are often the only tool available to solve complex problems, and rare event simulation requires special attention: the occurrence of the event has to be accelerated while keeping an unbiased estimator with a sufficiently small relative variance.
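To see why rare events need special attention, recall that the crude Monte Carlo estimator of a probability p from n independent samples has relative error sqrt((1 - p) / (n p)). The short sketch below only evaluates this standard formula; the function names are illustrative and are not taken from the cited works.

```python
import math


def crude_relative_error(p, n):
    """Relative standard deviation of the crude estimator of p with n samples."""
    return math.sqrt((1.0 - p) / (n * p))


def samples_for_relative_error(p, target):
    """Smallest n for which the crude estimator reaches the target relative error."""
    return math.ceil((1.0 - p) / (p * target ** 2))


if __name__ == "__main__":
    for p in (1e-3, 1e-6, 1e-9):
        n = samples_for_relative_error(p, 0.1)
        print(f"p = {p:g}: about {n:.3g} samples for a 10% relative error")
```

The required sample size grows like 1/p, which is what motivates variance reduction techniques such as importance sampling or splitting.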

These issues are summarized in the book on Monte Carlo simulation [74], addressed to graduate students or practitioners. It presents all the basic notions, from random number generation to output analysis and variance reduction techniques, and shows how they can be applied to the computation of integrals or sums, and to the solution of equations or optimization problems.

Novel results in simulation can be decomposed into two subsets: results on rare event simulation, and results on Randomized Quasi-Monte Carlo methods. Randomized Quasi-Monte Carlo (RQMC) methods estimate the expectation of a random variable by the average of n dependent realizations of it. In general, due to the strong dependence used, the estimation error may not obey a central limit theorem. Analyses of RQMC methods have so far focused mostly on the convergence rates of asymptotic worst-case error bounds and variance bounds when n tends to infinity, but little is known about the limiting distribution of the error. We have analyzed that asymptotic distribution in [18] for the special case of a randomly-shifted lattice rule, when the integrand is smooth. We show that for the simple case of one-dimensional integrands, the limiting error distribution is uniform over a bounded interval if the integrand is non-periodic, and has a square root form over a bounded interval if the integrand is periodic. We find that in higher dimensions, there is little hope of precisely characterizing the limiting distribution in a way that is useful for computing confidence intervals in the general case. We nevertheless examine how this error behaves as a function of the random shift from different perspectives and on various examples. We also point out a situation where a classical central limit theorem holds when the dimension goes to infinity, provide guidelines on when the error distribution should not be too far from normal, and examine how far from normal the error distribution is in examples inspired by real-life applications.
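The one-dimensional, non-periodic case can be illustrated with a small experiment. The sketch below is not the analysis of [18]; it assumes the simplest possible setting, the integrand f(x) = x on [0, 1) and the lattice points i/n shifted by a single uniform random number, and looks at the error multiplied by n. For this particular integrand the scaled error equals (n u mod 1) - 1/2 for a shift u, so it is uniform on [-1/2, 1/2), in line with the uniform limit described above. All function names are illustrative.

```python
import random


def shifted_lattice_estimate(f, n, shift):
    """Average f over the n points ((i/n + shift) mod 1), i = 0..n-1."""
    return sum(f((i / n + shift) % 1.0) for i in range(n)) / n


def scaled_errors(f, exact, n, replications, seed=0):
    """n * (estimate - exact) for independent uniform random shifts."""
    rng = random.Random(seed)
    return [
        n * (shifted_lattice_estimate(f, n, rng.random()) - exact)
        for _ in range(replications)
    ]


if __name__ == "__main__":
    errors = scaled_errors(lambda x: x, 0.5, n=64, replications=2000)
    inside = sum(1 for e in errors if abs(e) < 0.25) / len(errors)
    print(f"min = {min(errors):.3f}, max = {max(errors):.3f}")
    print(f"fraction with |n * error| < 0.25: {inside:.3f}")
```

A histogram of these scaled errors is flat rather than bell-shaped, which shows why a confidence interval based on a normal approximation would be inappropriate here.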

Most of our simulation activity has nevertheless been on rare event simulation. See [1], edited by G. Rubino and B. Tuffin, which presents a rather complete overview of the domain and of the state of the art in most of its subfields. In [17], we have discussed the importance of designing estimators that stay efficient as the probability of the considered event decreases to zero. While robustness properties generally look at the second moment only, we discuss the importance of investigating higher order moments, and define related properties. For adaptive estimators, whose parameters are therefore random, one cannot strictly guarantee robustness properties, but statistical guarantees can be provided, as highlighted in [66].

The typical application of rare event simulation we have used is the evaluation of the probability that a graph is disconnected. We propose in [19] a new Monte Carlo method, based on dynamic importance sampling, to estimate this probability. The method generates the link states one by one, using a sampling strategy that approximates an ideal zero-variance importance sampling scheme. The approximation is based on minimal cuts in subgraphs. In an asymptotic rare-event regime where the failure probability becomes very small, we prove that the relative error of our estimator remains bounded, and even converges to 0 under additional conditions, when the unreliability of individual links converges to 0, a property not satisfied by previous estimators. In [60], this method is combined with a Recursive Variance Reduction (RVR) estimator which estimates the unreliability by recursively reducing the graph from the random choice of the first working link on selected cuts. In [71], a conditional Monte Carlo method taking into account the expected number of samples over which a prespecified set of disjoint minpaths linking the set of nodes fails is analyzed, and combined with quasi-Monte Carlo methods. The combination with approximate zero-variance importance sampling is done in [35], where we derive asymptotic robustness properties of the resulting estimator when the reliabilities of individual links go arbitrarily close to one, and we illustrate numerically the gain that can be obtained on large graphs. In [61], we explore the capability of the Importance Splitting approach to attack these static problems. Importance Splitting is a generic technique designed to estimate rare event probabilities defined on stochastic processes; we adapt the method to the static context and discuss an idea for estimating network reliability metrics.
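As a rough illustration of importance sampling for graph disconnection, the sketch below estimates the source-to-terminal unreliability of a five-link bridge network. It is deliberately much simpler than the dynamic, cut-based scheme of [19]: it uses a static change of measure in which every link fails with an assumed probability `bias`, and corrects with the likelihood ratio. The network, the value of `bias` and all function names are assumptions made for the example. The graph is small enough for the exact value to be obtained by enumeration, which gives a reference.

```python
import itertools
import random

# Assumed example: bridge network, source node 0, terminal node 3.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
SOURCE, TARGET = 0, 3


def connected(up):
    """True if SOURCE reaches TARGET using only the links flagged as up."""
    reached, stack = {SOURCE}, [SOURCE]
    while stack:
        node = stack.pop()
        for (a, b), ok in zip(EDGES, up):
            if not ok:
                continue
            for u, v in ((a, b), (b, a)):
                if u == node and v not in reached:
                    reached.add(v)
                    stack.append(v)
    return TARGET in reached


def exact_unreliability(q):
    """Disconnection probability by enumerating all link states."""
    total = 0.0
    for up in itertools.product([True, False], repeat=len(EDGES)):
        if not connected(up):
            prob = 1.0
            for ok in up:
                prob *= (1.0 - q) if ok else q
            total += prob
    return total


def is_unreliability(q, n, bias=0.5, seed=0):
    """Static importance sampling: links fail with probability `bias`."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        weight, up = 1.0, []
        for _ in EDGES:
            failed = rng.random() < bias
            # Likelihood ratio of the original law over the sampling law.
            weight *= (q / bias) if failed else ((1.0 - q) / (1.0 - bias))
            up.append(not failed)
        if not connected(up):
            acc += weight
    return acc / n


if __name__ == "__main__":
    q = 1e-3
    print("exact:", exact_unreliability(q))
    print("importance sampling:", is_unreliability(q, 20000))
```

With q = 0.001 the event has probability close to 2e-6, so crude Monte Carlo with 20000 samples would almost never observe it, whereas the biased sampler sees disconnections in a large fraction of the runs. A fixed bias does not scale to large graphs, which is one reason for adapting the sampling to the minimal cuts as described above.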

We worked on the risk associated with spares provisioned for lifetime maintenance when the mean up time is uncertain. More precisely, we consider the production, in a single run, of a small quantity of complex (and expensive) systems, where we have to determine, at the moment when the global system is produced, the quantity of spares to produce for lifetime maintenance purposes. The difficulty comes from the fact that the steady state mean up time is not known precisely but is assumed to be uniformly distributed on an interval [a, b]. We were able to exhibit the formal expression of the probability distribution of the spare consumption for a given duration of the lifetime. A first paper was presented at a conference ([45]) and an extended version was published in [22].
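The effect of this uncertainty can be sketched numerically. The example below is not the formal expression of [45] and [22]; it assumes, purely for illustration, that for a given mean up time m the number of spares consumed over a horizon T is Poisson with mean T/m, and it averages this distribution over m uniform on [a, b] with a midpoint rule. The parameter values and function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of exactly k events."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def spare_consumption_pmf(k, horizon, a, b, steps=2000):
    """P(k spares consumed) when the mean up time is uniform on [a, b].

    Assumption: Poisson consumption with mean horizon / m given m.
    The integral over m is approximated with a midpoint rule.
    """
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        m = a + (i + 0.5) * width
        total += poisson_pmf(k, horizon / m)
    return total * width / (b - a)


def shortage_risk(spares, horizon, a, b):
    """Probability that more than `spares` units are needed."""
    covered = sum(spare_consumption_pmf(k, horizon, a, b)
                  for k in range(spares + 1))
    return 1.0 - covered


if __name__ == "__main__":
    horizon, a, b, spares = 10.0, 1.0, 3.0, 10
    plug_in_mean = horizon / ((a + b) / 2.0)
    plug_in = 1.0 - sum(poisson_pmf(k, plug_in_mean)
                        for k in range(spares + 1))
    print("risk with uncertain mean up time:",
          shortage_risk(spares, horizon, a, b))
    print("risk with the midpoint plugged in:", plug_in)
```

Under these assumptions, replacing the uncertain mean up time by its midpoint noticeably underestimates the shortage risk, because the low values of the mean up time dominate the tail of the consumption.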

Another study focuses on the case where, inside a maintenance organization, the repairman has to go to a distant operational site with a collection of spare items of different types because of a lack of information. At the end of his maintenance intervention, the unused spares are put back on the shelves. Because the intervention takes a significant amount of time (the travel time is long), we show that the period during which the spares are borrowed plays a significant part in the value of the spare shortage probability. We present in [56] a detailed model that allows us to take this situation into account. An attempt to produce a simpler model has resulted in a heuristic which can be recommended as a substitute for the detailed model only when using highly time-consuming iterative optimization processes.

The determination of the mission time availability when identical items share a single spare inventory has been the object of one of our studies [55]. The exchange time is neglected and any unavailability is due only to a lack of spares before the end of the mission. The lifetimes of the components are assumed to follow an exponential probability distribution. First, semi-formal expressions of state probabilities of different Boolean variables are obtained by means of efficient recurrent procedures. Then the value of the mission time availability is obtained by using graph manipulations initiated by an ordered Shannon tree (binary decision tree).
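A much simpler quantity than the one studied in [55] already shows the role of the shared inventory. The sketch below assumes that n identical items all operate throughout the mission, that each has an exponential lifetime with the same failure rate, and that a failed item is replaced instantly as long as a spare remains. Under these assumptions failures occur as a Poisson process of rate n times the failure rate until the stock is exhausted, so the probability of completing the mission without shortage is a Poisson cumulative probability. This is an illustrative baseline, not the Shannon tree procedure described above, and the names are hypothetical.

```python
import math


def no_shortage_probability(n_items, failure_rate, spares, mission_time):
    """P(at most `spares` failures occur during the mission).

    Assumptions: n_items identical items in operation at all times,
    exponential lifetimes, instantaneous exchange while spares remain.
    """
    mean = n_items * failure_rate * mission_time
    return sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(spares + 1)
    )


if __name__ == "__main__":
    for spares in range(6):
        prob = no_shortage_probability(4, 0.01, spares, 50.0)
        print(f"{spares} spares: P(no shortage) = {prob:.4f}")
```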

Another study [54] presents a solution method for evaluating the operational availability when there is a switch-over (commutation) time in a redundant system with different elements. The solution must be analytic so that it can be included in software that uses iterative optimization processes, such as spare optimization with an availability target. So far, we have been able to obtain a very satisfactory approximation for the case of (n-1)/n redundancy. Future work is needed for the general case k/n.

In [15], we improved an algorithm we developed a few years ago for computing the distribution of the cumulative reward in Markov chains.

