## Section: New Results

### Asymptotic behavior of bifurcating autoregressive processes with missing data

Participants: Benoîte de Saporta, Anne Gégout-Petit.

Bifurcating autoregressive (BAR) processes generalize autoregressive (AR) processes to data with a binary tree structure. They are typically used to model cell lineage data, since each cell in one generation gives birth to two offspring in the next. Cell lineage data usually consist of observations of some quantitative characteristic of the cells over several generations descended from an initial cell.
Recently, experiments by biologists on the aging of *Escherichia coli* (see [60]) motivated mathematical and statistical studies of the asymmetric BAR process, in which the quantitative characteristics of the even and odd sisters are allowed to depend on their mother's through different sets of parameters.
The originality of this work is that we take into account possibly missing data in the estimation procedure of the parameters of the asymmetric BAR process.
This is a problem of practical interest, as experimental data are often incomplete, either because some cells died or because the measurement of the characteristic under study was impossible or faulty. For instance, among the 94 colonies studied in [60], only two data sets are complete, with 2 and 6 generations respectively. On average over the 94 colonies, which divide up to 9 times, about 23% of the data are missing. It is therefore important to take this phenomenon into account.
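For concreteness, a first-order asymmetric BAR process of the kind described above can be written as follows. This is a sketch in notation that is assumed here rather than taken from this section: cells are indexed by integers so that cell $k$ has daughters $2k$ (even) and $2k+1$ (odd), $X_k$ is the characteristic of cell $k$, $(a, b)$ and $(c, d)$ are the two parameter sets, and $\varepsilon$ is a noise sequence.

```latex
\begin{aligned}
X_{2k}   &= a + b\,X_k + \varepsilon_{2k}, \\
X_{2k+1} &= c + d\,X_k + \varepsilon_{2k+1},
\end{aligned}
\qquad k \ge 1 .
```

In this notation the symmetric case corresponds to $a = c$ and $b = d$.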

The *naive* approach to handling missing data would be to replace the sums over all data in the estimators by sums over the observed data only. Our approach is slightly more subtle. We propose a structure for the observed data based on a two-type Galton-Watson process consistent with the possibly asymmetric structure of the BAR process. Basically, the probability of observing a cell depends on the type of both this cell and its mother. We propose an estimation procedure for the parameters of the asymmetric BAR process in this context, and prove sharp convergence and asymptotic normality results for our estimators.
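The observation scheme described above can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a first-order asymmetric BAR model with Gaussian noise, and it assumes that the two daughters of an observed mother are observed independently, with a hypothetical probability `p_obs[i][j]` depending on the mother's type `i` and the daughter's type `j` (0 = even, 1 = odd). Daughters of a missing cell are missing. The function names and the ordinary least-squares fit on observed mother-daughter pairs are illustrative choices only.

```python
import numpy as np


def simulate_bar_with_missing(n_gen, a, b, c, d, p_obs, sigma=1.0, seed=0):
    """Simulate an asymmetric BAR(1) tree with missing cells (illustrative).

    Cells are indexed 1 .. 2**(n_gen + 1) - 1; cell k has daughters 2k (even)
    and 2k + 1 (odd). p_obs[i][j] is the assumed probability of observing a
    daughter of type j given an observed mother of type i (0 = even, 1 = odd).
    """
    rng = np.random.default_rng(seed)
    n = 2 ** (n_gen + 1)
    x = np.zeros(n)                 # index 0 is unused
    delta = np.zeros(n, dtype=bool)  # delta[k] is True when cell k is observed
    x[1] = rng.normal(0.0, sigma)
    delta[1] = True                 # the initial cell is always observed
    for k in range(1, n // 2):
        # Values are generated for every cell; delta records which are seen.
        x[2 * k] = a + b * x[k] + rng.normal(0.0, sigma)
        x[2 * k + 1] = c + d * x[k] + rng.normal(0.0, sigma)
        if delta[k]:
            i = k % 2  # mother's type; the root (index 1) counts as odd here
            delta[2 * k] = rng.random() < p_obs[i][0]
            delta[2 * k + 1] = rng.random() < p_obs[i][1]
    return x, delta


def least_squares_observed(x, delta):
    """Fit (intercept, slope) per daughter type using observed pairs only."""
    ks = np.arange(1, len(x) // 2)
    estimates = {}
    for name, offset in (("even", 0), ("odd", 1)):
        mothers = ks[delta[2 * ks + offset]]  # an observed daughter implies an observed mother
        design = np.column_stack([np.ones(len(mothers)), x[mothers]])
        coef, *_ = np.linalg.lstsq(design, x[2 * mothers + offset], rcond=None)
        estimates[name] = coef
    return estimates


x, delta = simulate_bar_with_missing(
    n_gen=12, a=0.5, b=0.3, c=-0.2, d=0.6,
    p_obs=[[0.95, 0.95], [0.95, 0.95]],
)
estimates = least_squares_observed(x, delta)
print(estimates)
```

Because an unobserved mother never has observed daughters, the set of observed cells forms a two-type Galton-Watson tree, which is the structure the text relies on.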

This work is in collaboration with Laurence Marsalle of Lille 1 University. It was presented at the *Journées de Statistique* in Marseille in May [33] and at the seminar of the probability and statistics team in Bordeaux, and has been submitted for publication (see http://hal.inria.fr/inria-00494793/en/).