## Section: New Results

### Markov models

#### Variational approach for the joint estimation-detection of Brain activity from functional MRI data

Participants : Florence Forbes, Lotfi Chaari, Thomas Vincent.

**Joint work with:** Michel Dojat (Grenoble Institute of
Neuroscience) and Philippe Ciuciu from Neurospin, CEA in Saclay.

In standard fMRI within-subject analysis, two steps are generally
performed separately: detection and estimation. Because these two
steps are inherently linked, we proposed in this work a joint
detection-estimation procedure.
We adopt the so-called region-based Joint Detection Estimation
(JDE) framework that deals with spatial dependencies between
voxels belonging to the same functionally homogeneous
*parcel* in the mask of the 3D brain. After building a
spatially adaptive General Linear Model, prior information is
introduced and a hierarchical Bayesian model is established. In
contrast to previous works that use Markov Chain Monte Carlo
(MCMC) techniques to approximate the resulting intractable
posterior distribution, we recast the JDE into a missing data
framework and derive a Variational Expectation-Maximization (VEM)
algorithm for its inference. The result is a new algorithm that
exhibits interesting properties compared to the previously used
MCMC-based approach. Experiments on artificial and real data show
that VEM-JDE is robust to model mis-specification and provides
a computational gain while maintaining good performance.
Corresponding papers: [27], [38], [26].
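
As a generic reminder of the variational EM principle (not the specific JDE derivation of the cited papers), the scheme can be summarized as follows. The symbols are assumptions introduced here for illustration: $y$ denotes the observed data, $z = (z_1, \dots, z_K)$ the missing variables, $\theta$ the parameters, and $q$ a factorized approximation of the posterior.

```latex
\begin{align}
\log p(y;\theta) &= \mathcal{F}(q,\theta)
   + \mathrm{KL}\big(q(z)\,\|\,p(z \mid y;\theta)\big), \\
\mathcal{F}(q,\theta) &= \mathbb{E}_{q}\big[\log p(y,z;\theta)\big]
   - \mathbb{E}_{q}\big[\log q(z)\big], \\
\text{E-step:}\quad q_k(z_k) &\propto
   \exp\Big(\mathbb{E}_{q_{-k}}\big[\log p(y,z;\theta)\big]\Big),
   \qquad q(z) = \prod_{k=1}^{K} q_k(z_k), \\
\text{M-step:}\quad \theta &\leftarrow
   \arg\max_{\theta}\ \mathbb{E}_{q}\big[\log p(y,z;\theta)\big].
\end{align}
```

Because the KL term is non-negative, $\mathcal{F}$ is a lower bound on the log-likelihood, and both steps increase it. This replaces sampling from the intractable posterior with deterministic updates.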

#### Adaptive experimental condition selection in event-related fMRI

Participants : Florence Forbes, Christine Bakhous, Lotfi Chaari, Thomas Vincent.

**Joint work with:** Michel Dojat (Grenoble Institute of
Neuroscience) and Philippe Ciuciu from Neurospin, CEA in Saclay.

Standard Bayesian analysis of event-related functional Magnetic Resonance Imaging (fMRI) data usually assumes that every delivered stimulus may generate a BOLD response everywhere in the brain, although only some stimuli are likely to induce activation, and only in specific brain areas. Criteria are not always available to select the relevant conditions or stimulus types (e.g., visual, auditory) prior to estimation, and the unnecessary inclusion of the corresponding events may degrade the results. To address this issue, we propose, within a Joint Detection Estimation (JDE) framework, a procedure that automatically selects the conditions according to the brain activity they elicit. This leads to improved activation detection, which we illustrate on real data.

#### Finding Audio-Visual Events in Informal Social Gatherings

Participant : Florence Forbes.

**Joint work with:** Xavier Alameda-Pineda and Radu Horaud from
the INRIA Perception team.

In this work [21] we addressed the problem of detecting and localizing objects that can be both seen and heard, e.g., people. This may be solved within the framework of data clustering. We proposed a new multimodal clustering algorithm based on a Gaussian mixture model, where one of the modalities (visual data) is used to supervise the clustering process. This was made possible by mapping both modalities into the same metric space. To this end, we fully exploited the geometric and physical properties of an audio-visual sensor based on binocular vision and binaural hearing. We proposed an EM algorithm that is theoretically well justified, intuitive, and extremely efficient from a computational point of view. This efficiency makes the method implementable on advanced platforms such as humanoid robots. We described in detail tests and experiments performed with publicly available data sets that yield very interesting results.
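
To make the clustering step concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. It is not the multimodal algorithm of [21]: the function name `em_gmm_1d` is hypothetical, and the initial means merely stand in for cues that a second modality could supply, which is an assumption of this illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=50):
    """Plain EM for a 1D Gaussian mixture (illustrative sketch only).

    init_means plays the role of hypothetical cues from another modality.
    Returns (weights, means, variances).
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: weighted updates of weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            safe_nj = max(nj, 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / safe_nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / safe_nj
            variances[j] = max(var, 1e-6)  # keep variances strictly positive
    return weights, means, variances


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(-2.0, 0.5) for _ in range(200)]
    sample += [random.gauss(3.0, 0.5) for _ in range(200)]
    print(em_gmm_1d(sample, [-1.0, 1.0]))
```

Each iteration costs a number of operations proportional to the number of points times the number of components, which is consistent with the computational efficiency mentioned above.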

#### Spatial risk mapping for rare disease with hidden Markov fields and variational EM

Participants : Lamiae Azizi, Florence Forbes, Senan James Doyle.

**Joint work with:** David Abrial and Myriam Garrido from INRA
Clermont-Ferrand-Theix.

We recast the disease mapping problem of automatically classifying geographical units into risk classes as a clustering task using a discrete hidden Markov model and Poisson class-dependent distributions. The designed hidden Markov prior is nonstandard and consists of a variation of the Potts model in which the interaction parameter can depend on the risk classes. The model parameters are estimated using an EM algorithm and the mean field approximation. This circumvents the intractability of standard EM in this spatial context and offers a computationally efficient alternative to more intensive simulation-based Markov Chain Monte Carlo (MCMC) procedures. We then focus on the issue of dealing with very low risk values and small numbers of observed cases and population sizes. We address the problem of finding good initial parameter values in this context and develop a new initialization strategy appropriate for spatial Poisson mixtures when classes are poorly separated, as encountered in animal disease risk analysis. Using both simulated and real data, we compare this strategy to other standard strategies and show that it performs well in many situations. Corresponding papers and communications: [43], [24], [37], [25].
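
The following minimal sketch illustrates how a mean-field E-step and a closed-form M-step can be combined for a Poisson mixture with a Potts prior. It simplifies the model described above: the interaction parameter `beta` is a single fixed value rather than class-dependent and estimated, and the function name `mean_field_em` and its arguments are hypothetical.

```python
import math


def poisson_logpmf(y, mu):
    """Log-probability of count y under a Poisson distribution of mean mu > 0."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def mean_field_em(counts, pops, neighbors, init_rates, beta=1.0, n_iter=30):
    """Mean-field EM for a hidden Potts field with Poisson emissions (sketch).

    counts[i]    : observed cases in unit i
    pops[i]      : population size of unit i
    neighbors[i] : indices of the units adjacent to unit i
    init_rates   : initial risk per class (assumed values)
    beta         : fixed interaction strength (a simplification of this sketch)
    Returns (q, rates), with q[i][c] the approximate posterior of class c in unit i.
    """
    n, k = len(counts), len(init_rates)
    rates = list(init_rates)
    q = [[1.0 / k] * k for _ in range(n)]
    for _ in range(n_iter):
        # E-step: each unit sees its neighbors through their current q values.
        for i in range(n):
            logits = []
            for c in range(k):
                field = beta * sum(q[j][c] for j in neighbors[i])
                logits.append(field + poisson_logpmf(counts[i], pops[i] * rates[c]))
            top = max(logits)
            w = [math.exp(v - top) for v in logits]  # stable normalization
            total = sum(w)
            q[i] = [v / total for v in w]
        # M-step: weighted ratio of cases to population for each class.
        for c in range(k):
            num = sum(q[i][c] * counts[i] for i in range(n))
            den = sum(q[i][c] * pops[i] for i in range(n))
            if den > 0:
                rates[c] = max(num / den, 1e-12)  # keep rates strictly positive
    return q, rates
```

As in the text, the result depends on `init_rates`, so a poor initialization can merge classes when risks are low and counts are small.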

#### Probabilistic model definition for physiological state monitoring

Participants : Laure Amate, Florence Forbes.

**Joint work with:** Catherine Garbay, Julie Fontecave-Jallon
and Benoit Vettier from LIG.

Assessing the global situation of a person from physiological data is a well-known difficult problem. In previous work, we proposed a system that does not produce a diagnosis but instead follows a set of hypotheses and uses this information to decide whether a situation is alarming. In this work [22], we focus on the data processing part of the system, taking into account the complexity and ambiguity of the data. We propose a statistical approach with a global model based on a Hidden Markov Model, and we present data models that rely on classical physiological parameters and expert knowledge. We then learn a model that depends on the person and their environment, and we define and compute confidence values to assess the plausibility of hypotheses.
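
One simple way to obtain a plausibility value for each hypothesis from a Hidden Markov Model is forward filtering. The sketch below is an assumption about how this could look, not the system of [22]: the two states, the discretized observations and all probabilities are hypothetical.

```python
def forward_filter(obs, init, trans, emit):
    """Filtered probabilities P(state_t | obs_1..t) for a discrete HMM.

    obs   : sequence of observation indices
    init  : init[s], prior probability of state s
    trans : trans[r][s], probability of moving from state r to state s
    emit  : emit[s][o], probability of observing o in state s
    """
    n_states = len(init)
    belief = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    norm = sum(belief)
    belief = [b / norm for b in belief]
    history = [belief]
    for o in obs[1:]:
        # Predict the next state, then correct with the new observation.
        predicted = [
            sum(belief[r] * trans[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
        belief = [predicted[s] * emit[s][o] for s in range(n_states)]
        norm = sum(belief)
        belief = [b / norm for b in belief]
        history.append(belief)
    return history


if __name__ == "__main__":
    # Hypothetical setup: state 0 = "normal", state 1 = "alarming";
    # observation 0 = "usual reading", observation 1 = "unusual reading".
    beliefs = forward_filter(
        obs=[0, 0, 1, 1, 1, 1],
        init=[0.9, 0.1],
        trans=[[0.95, 0.05], [0.10, 0.90]],
        emit=[[0.8, 0.2], [0.3, 0.7]],
    )
    for t, b in enumerate(beliefs):
        print(t, [round(p, 3) for p in b])
```

In this toy run, the probability of the "alarming" state rises only after several unusual readings in a row, which matches the idea of following hypotheses rather than issuing an immediate diagnosis.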

#### Solder Paste Inspection

Participants : Florence Forbes, Senan James Doyle, Darren Wraith.

This is joint work with VI-Technology.

The majority of defects in PCB manufacture are attributed to the stencil printing process. Stencil printing is the process where *solder paste bricks*
are deposited on the PCB *pads*. Solder paste deposition is required to be accurate and repeatable; however, complex physical processes make this problematic.
Components are placed, and their leads are pushed into the solder paste. The solder paste is then melted using, for example, *reflow soldering*.

Inspection can be performed before the solder paste is melted, and it is more economical to identify defects at this stage.

The evaluation of solder paste joint quality involves the analysis of a number of indicative measurements. From these measurements,
potential faults are identified and inspected manually. The general challenge is to reduce the number of potential
faults by better analyzing the indicative factor measurements, that is, to improve the *first pass yield* (FPY),
which is the percentage of total solder deposits that are good and do not require manual inspection.
However, the ability to catch defects must be retained. Another aspect to consider is the temporal nature of the process:
the mechanism for identifying faults needs to be retrained after a period of time, and so a solution must be capable of using a small training dataset.
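
The trade-off between first pass yield and defect detection can be made explicit with two small helper functions. This is only an illustration of the definitions above and is unrelated to the confidential analysis: the function names and the `escape_rate` indicator are assumptions introduced here.

```python
def first_pass_yield(flagged):
    """Percentage of deposits that pass without manual inspection.

    flagged[i] is True when deposit i is sent to manual inspection.
    """
    if not flagged:
        raise ValueError("at least one deposit is required")
    passed = sum(1 for f in flagged if not f)
    return 100.0 * passed / len(flagged)


def escape_rate(flagged, defective):
    """Fraction of truly defective deposits that were not flagged.

    defective[i] is True when deposit i is actually faulty (hypothetical
    ground truth, e.g. obtained from a later inspection stage).
    """
    n_defective = sum(1 for d in defective if d)
    if n_defective == 0:
        return 0.0
    missed = sum(1 for f, d in zip(flagged, defective) if d and not f)
    return missed / n_defective
```

Raising the FPY by flagging fewer deposits is only acceptable if the escape rate stays low, which is the constraint stated above.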

It is important to understand and identify the factors that influence quality. The industry standard factor for measuring quality is solder volume. The precise volume is not directly observable, and so is estimated. Often, height is used as a proxy measure for solder bricks of equal area and shape. There are many other contributing factors; however, not all of these can be measured directly, making accurate quality determination difficult.

Stencil printing process control attempts to adjust machine parameters according to informative factors. Online printing process control faces a similar challenge of using a limited number of measurements to assess the quality of solder paste deposition.

We used statistical techniques to analyze such measurements. The exact nature of the work is confidential.

#### PCB defect detection

Participants : Florence Forbes, Kai Qin, Huu Giao Nguyen.

This is joint work with VI-Technology.

The objective is to detect defective components on PCBs from image data. The exact nature of the work is confidential.

#### Statistical characterization of tree structures based on Markov Tree Models and multitype branching processes, with applications to tree growth modeling

Participant : Jean-Baptiste Durand.

**Joint work with:** Pierre Fernique (Montpellier 2 University
and CIRAD) and Yann Guédon (CIRAD), INRIA Virtual Plants.

The quantity and quality of yields in fruit trees are closely related
to processes of growth and branching, which ultimately determine the
regularity of flowering and the position of flowers. Flowering and
fruiting patterns are explained by statistical dependence between
the nature of a parent shoot (*e.g.*, flowering or not) and the
quantity and nature of its child shoots, with potential
effects of covariates. Thus, better characterization of patterns and
dependencies is expected to lead to strategies to control the
demographic properties of the shoots (through varietal selection or crop
management policies), and thus to bring substantial improvements in
the quantity and quality of yields.

Since the connections between shoots can be represented by mathematical trees, statistical models based on multitype branching processes and Markov trees appear as a natural tool to model the dependencies of interest. Formally, the properties of a vertex are summed up using the notion of vertex state. In such models, the numbers of children in each state given the parent state are modelled through discrete multivariate distributions. Model selection procedures are necessary to specify parsimonious distributions. We developed an approach based on probabilistic graphical models to identify and exploit properties of conditional independence between numbers of children in different states, so as to simplify the specification of their joint distribution. The graph building stage was based on a Poissonian Generalized Linear Model for the contingency tables of the counts of joint children state configurations. Then, parametric families of distributions were implemented and compared statistically to provide probabilistic models compatible with the estimated independence graph.
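
To illustrate the generative mechanism of a multitype branching process, the sketch below simulates one generation in which the numbers of children of each state are independent Poisson variables given the parent state. Full independence is the simplest special case of the conditional independence structures discussed above and is an assumption of this sketch, as are the function names and the numerical values.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_generation(parents, mean_children, rng):
    """Simulate the children of a list of parent vertices.

    parents             : list of parent states (integers)
    mean_children[s][t] : mean number of state-t children of a state-s parent
    Child counts in different states are drawn independently here, which is
    a simplifying assumption of this sketch.
    """
    children = []
    for s in parents:
        for t, mean in enumerate(mean_children[s]):
            children.extend([t] * sample_poisson(mean, rng))
    return children


if __name__ == "__main__":
    # Hypothetical states: 0 = vegetative shoot, 1 = flowering shoot.
    rng = random.Random(0)
    means = [[0.5, 1.5], [1.0, 0.2]]
    generation = [0]
    for depth in range(4):
        generation = simulate_generation(generation, means, rng)
        print(depth + 1, len(generation))
```

Replacing the independent draws with a joint discrete multivariate distribution is where the model selection described above comes into play.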

This work was carried out in the context of Pierre Fernique's Master 2 internship (Montpellier 2 University and AgroParisTech). It was applied to model dependencies between short or long, vegetative or flowering shoots in apple trees. The results highlighted contrasting patterns related to the parent shoot state, with interpretation in terms of alternation of flowering (see paragraph 6.2.9). This work will be continued during Pierre Fernique's PhD thesis, with extensions to other fruit tree species and other strategies to build probabilistic graphical models and parametric discrete multivariate distributions including covariates and mixed effects.

#### Statistical characterization of the alternation of flowering in fruit tree species

Participant : Jean-Baptiste Durand.

**Joint work with:** Jean Peyhardi and Yann Guédon (Mixed
Research Unit DAP, Virtual Plants team), Evelyne Costes and
Baptiste Guitton (DAP, AFEF team), Catherine Trottier (Montpellier
University).

The aim of this work was to characterize the genetic determinism of the alternation of flowering in apple tree progenies. Data were collected at two scales: at the whole-tree scale (with an annual time step) and at a local scale (the annual shoot, or AS, which is the portion of stem grown during a single year). Two replications of each genotype were available.

To model alternation of flowering at AS scale, a second-order Markov tree model was built. The ASs were of two types: flowering or vegetative. Generalized Linear Mixed Models (GLMMs) were used to model the effect of year, replications and genotypes (with their interactions with year or memories of the Markov model) on the transition probabilities. This work was the continuation of the Master 2 internship of Jean Peyhardi (Bordeaux 2 University) and was carried out in the context of the PhD thesis of Baptiste Guitton.
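
The notion of second-order memory can be illustrated by estimating transition probabilities from counts. The sketch below works on plain sequences rather than trees and ignores covariates and random effects, so it is a strong simplification of the model described above; the function name and the state coding are hypothetical.

```python
from collections import defaultdict


def second_order_transitions(sequences, n_states=2):
    """Empirical estimate of P(x_t | x_{t-2}, x_{t-1}) from state sequences.

    Returns a dict mapping each observed memory (x_{t-2}, x_{t-1}) to a list
    of probabilities over the next state. Memories never observed are absent.
    """
    counts = defaultdict(lambda: [0] * n_states)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    probs = {}
    for memory, row in counts.items():
        total = sum(row)
        probs[memory] = [v / total for v in row]
    return probs


if __name__ == "__main__":
    # Hypothetical coding: 0 = vegetative AS, 1 = flowering AS.
    shoots = [[1, 0, 1, 0, 1, 0, 1, 0], [0, 0, 1, 0, 1, 1, 0, 1]]
    for memory, row in sorted(second_order_transitions(shoots).items()):
        print(memory, [round(p, 2) for p in row])
```

In a GLMM formulation, each of these probabilities would instead be linked to fixed and random effects, such as year and genotype, through a link function.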

This PhD thesis also comprised the study of alternation in flowering at individual scale, with an annual time step. To relate alternation of flowering at AS and individual scales, indices were proposed to characterize alternation at individual scale. The difficulty is related to early detection of alternating genotypes, in a context where alternation is often concealed by a substantial increase of the number of flowers over consecutive years. To separate correctly the increase of the number of flowers due to aging of young trees from alternation in flowering, our model relied on a parametric hypothesis on the base effect (random slopes specific to genotype and replications), which translated into mixed effect modelling. Different indices of alternation were then computed on the residuals. Clusters of individuals with contrasting bearing habits were identified. Our models highlighted significant correlations between indices of alternation at AS and individual scales. The roles of local alternation and asynchronism in regularity of flowering were assessed using an entropy-based criterion, which characterized asynchronism.

As a perspective of this work, patterns in the production of child ASs (numbers of flowering and vegetative children), depending on the type of the parent AS, must be analyzed using branching processes and different types of Markov trees, in the context of Pierre Fernique's PhD thesis (see paragraph 6.2.8).