Section: New Results
Probabilistic models
Markov random field models of shape
Keywords : Markov random field, phase field, shape, tree crown.
Participant : Ian H. Jermyn [ contact ] .
This work was performed in collaboration with Mr. Tamas Blaskovics and Professor Zoltan Kato of the University of Szeged, Hungary.
The phase field higherorder active contour framework for shape modelling developed [3] in Ariana lends itself to a probabilistic interpretation, the phase field energies being taken as the Gibbs energies of a Markov random field (MRF). This opens the way to parameter and model estimation, stochastic algorithms, and much else. As described in [13] , starting from the continuum phase field energy, both the domain and codomain of the phase field must be discretized. Taking advantage of the fact that the phase field assumes the values ±1 except near region boundaries, the codomain can be discretized to binary values. When coupled with spatial discretization on a lattice, a binary Markov random field is produced; standard techniques, such as Gibbs sampling and simulated annealing, can then be applied. Approximate relations between the parameters of the phase field model and those of the binary field are also available, meaning, in particular, that parameter ranges leading to particular shapes being stable, deduced from stability analyses of the continuum model, can be carried over to the MRF.
Figure 6 shows the interaction kernel of the MRF, some representative samples, and an application of the MRF model resulting from fixing the parameters to allow for stable circles (the `gas of circles' model) to the problem of tree crown extraction from an aerial image. The algorithm used was simulated annealing with a Gibbs sampler.
Optimization algorithms for Markov random field models of shape
Keywords : Markov random field, phase field, SwendsenWang, graph cut, QPBO.
Participants : Antoine Labatie, Ian H. Jermyn [ contact ] , Josiane Zerubia.
The MRFs constructed in [13] are not very tractable theoretically or algorithmically, because they are highly frustrated. Nevertheless, it is necessary to sample from these fields, and to minimize their energies, efficiently. The work summarized here involved comparing the performance of a number of sampling and optimization algorithms applied to these MRFs.
Cluster algorithms were compared to Gibbs sampling, as used in [13] . The standard versions of these algorithms [43] , [44] do not apply to frustrated models, so instead the generalized clustering algorithm proposed in [39] was tested. Surprisingly, this algorithm performed badly, being very slow to converge and tending to cluster the whole image. Thus although theoretically sound, in practice this algorithm is not useful for the MRF models used here. The development of efficient sampling algorithms that better incorporate the correlation structure is the next step.
For energy minimization, simulated annealing and quadratic pseudoboolean optimization (QPBO) [40] , [41] were compared. Frustration means that graph cuts cannot find the global energy minimum; QPBO is an alternative for general binary energies. It labels a subset of lattice nodes, these being guaranteed to form part of the global minimum. If the number of unlabelled nodes is small, their labels can be found, e.g. by simulated annealing. QPBO worked well for the MRFs used here, but its memory requirements render it impractical for large images; on the other hand, it was faster than simulated annealing. The quality of the results did not vary much between these algorithms, and were similar to those obtained using gradient descent with the phase field model.
One interesting observation is that, in common with algorithms for other NPhard problems, QPBO's performance undergoes a phase transition, as a function of degree of frustration relative to external field strength. This is true even when the external field is constant, and when the zerofield ground state is constant. This means that QPBO is of no use in finding the ground states of frustrated systems. This will be studied in detail in future work.
Shape space approach to curves and surfaces
Keywords : curve, surface, shape, invariance, equivariance, distance, metric, Riemannian, coordinates.
Participant : Ian H. Jermyn [ contact ] .
This work was performed in collaboration with Professors Anuj Srivastava and Eric Klassen of Florida State University, USA. It was partly funded by INRIA Associated Team SHAPES.
The `shape space' approach to curves and surfaces embedded in constructs a `shape space' S from a space of parameterized geometrical objects C by quotienting by a group G of `shapepreserving' transformations. An equivariant Riemannian metric g on C is pushed down to S and used to measure shape similarity. Two key questions in this framework are: what should g be and what coordinates on C simplify the form of this g as much as possible?
Regarding g , there is now a body of work that uses a oneparameter family of metrics G_{c} , , on C known as the `elastic metric'. This metric, which measures both changes in curve orientation and stretching, is suitable for many applications. In previous work [45] on the case n = 2 , was singled out as special because it was amenable to analytical treatment. This does not generalize to n>2 , however, and in this work [7] we have elucidated the reason as part of a general answer to the second question. The results are as follows. For n = 2 , for all c1 , the Riemannian curvature of C is zero except at the origin, where it is singular. When c = 1 , the singularity disappears and C is isometric to . For n>2 , the situation is quite different. For c1 , the curvature of C is nowhere zero, and is again singular at the origin. This explains the failure of [45] to generalize to higher dimensions. On the other hand, for c = 1 , C is isometric to for all n . The choice c = 1 thus leads to simplified calculations for any n , and has been applied, for example, in [7] to the extract of shapes from 2d point clouds, and to modelling the shapes of protein backbones. Figure 7 shows an example of clustering of curves in using this metric.
For surfaces, the situation is more difficult because of the complexity of Diff (S^{2}) . The generalization of the orientation term in the elastic metric is straightforward, but for the stretching term there are several possibilities. One involves using the unique oneparameter family of ultralocal metrics on the space of Riemannian metrics on a manifold. The next question is which coordinates to use to express this metric.

Shape descriptors based on shape entropy
Keywords : shape, classification.
Participant : Xavier Descombes [ contact ] .
This study was supported by INRIA Associated Team ODESSA. It was conducted in collaboration with Serguei Komech, IIPT Moscow (Russian Academy of Sciences).
In this work, we address shape classification. A shape is a convex bounded set in . We consider a basic descriptor _{0}(S) defined as the ratio of the volume of the neighbourhood of the shape to the shape volume. The initial shape is then transformed by a map, parameterized by an angle , which extends the shape along the direction by a factor and contracts the shape along the orthogonal direction by a factor . We thus obtain a function for our descriptor (S, ) (see figure 8 ). A rotation of the shape is equivalent to a shift of this curve. A reflection of the shape leads to a reflection of the curve. To obtain invariance with respect to rotation and reflection, we consider the following metric:
We have tested this metric for shape retrieval using the Kimia database. The results are convincing for discriminating complex shapes.
Shape recognition for image scene analysis
Keywords : object extraction, shape prior, marked point process, active contour, birthanddeath dynamics.
Participants : Maria S. Kulikova, Ian H. Jermyn [ contact ] , Xavier Descombes, Josiane Zerubia.
This Ph.D. was funded by a MESR and also supported by INRIA/FSU associated team `SHAPES' [http://wwwsop.inria.fr/ariana/Projets/Shapes/ ], and INRIA/IITP/UIIP associated team `ODESSA' [http://wwwsop.inria.fr/ariana/Projets/Odessa/ ]
Object detection from optical satellite and aerial images is one of the most important tasks in remote sensing image analysis. The problem arises in many applications, both civilian and military. Nowadays the resolution of aerial images ranges from several tens of centimetres down to several centimetres. At these resolutions, the geometry of objects is clearly visible, and needs to be taken into account in analysing the images.
Stochastic Marked Point Processes (MPP) are known for their ability to include this type of information. Previously, MPP models have been successfully used for object extraction from images of lower resolution, where the objects have simplified geometries and were thus represented using simply shape objects, e.g. discs, ellipses, or rectangles.
The aim of our work [22] is to lift this restriction, i.e. to define a marked point process model in a space of arbitrarilyshaped objects, without increasing the dimension of the space of a single object. We represent a single object by its boundary, a closed curve. The set of possible single objects (i.e. boundaries) is defined by the local minima of an active contour energy consisting of an image term and, initially, a weak prior model of the shape of the objects (a boundary smoothing term). In a second model, strong prior shape knowledge is incorporated into the energy.
Figure 9 demonstrates the first results of object extraction using three different models. The top right image shows the configuration obtained using simplyshaped objects, in this case ellipses. The bottom left image shows the configuration obtained using the extended MPP model for arbitrarilyshaped object extraction, but with only weak shape information. The bottom right image shows the result obtained using the extended model with strong prior shape information included.
Tropical forest resource assessment using marked point processes
Keywords : Marked point processes, simulated annealing, tree detection.
Participants : Virginie Journo, Xavier Descombes, Josiane Zerubia [ contact ] .
This internship was funded by an AAP INRA INRIA contract. It was conducted in collaboration with Pierre Couteron and Christophe Proisy from UMR AMAP/IRD, Montpellier.
In this work, we evaluated the performance of marked point processes for detecting individual trees in highly dense tropical forests. We have considered a multiple birthanddeath dynamics. The model was previously defined for evaluating poplar plantations. It consists of a prior avoiding overlapping objects and a data term based on a radiometric distance between pixels inside the detected crowns and pixels surrounding them. We have evaluated two kinds of data: panchromatic IKONOS images and LIDAR data. The two sets of results are consistent. With both panchromatic and LIDAR data, detection is accurate on the tallest trees defining the canopy, while performance decreases for shorter trees. We are currently working on a fusion process to use the information from both sensors simultaneously. In addition, we evaluated the results with respect to ground truth.
Stochastic modelling of 3D objects applied to penguin detection and counting
Keywords : 3D object detection, Markov marked point process, multiple birthanddeath dynamics, parallelization.
Participants : Ahmed GamalEldin, Xavier Descombes, Josiane Zerubia [ contact ] .
This work is being performed in collaboration with Dr. Michel GauthierClerc from La Tour du Valat [http://www.tourduvalat.org/ ] and Prof. Yvon Le Maho from Institut Hubert Curien [http://iphc.in2p3.fr/ ].
This work addresses the problem of 3D object extraction from handheld camera images, so we face all the associated effects: object occlusion, shadows, depth, etc. We propose to solve the problem in an objectbased framework, i.e. using Markov marked point processes (MMPP). We are applying our model at first to almost rigid objects, which represent the penguins. We run the tests on images of penguin colonies provided by our ecologist collaborators. The MMPP is based on defining a configuration space of our objects of interest, to which is attached a Gibbs energy function. Minimizing this energy function corresponds to detecting the objects of interest. We use the recently developed Multiple Birth and Death dynamics for the energy minimization because of its convergence speed. Our work is based on simulating images of the 3D scene containing our objects (using OpenGL), which by projection give images that we compare with the original image. We have developed two new data terms based on the histogram and correlogram distances. Due to the computational complexity of dealing with 3D objects, we are investigating a parallel algorithm based on graph theory so that we can take advantage of multicore architectures. The projection parameters are a key element of the image simulation step, and have to be estimated from a single image. We show in figure 10 some preliminary results on simulated images. This work will be used in the future to detect and count penguins in Antartica.
Recovering perspective information from uniformly distributed features
Keywords : camera pose estimation, shape from texture, Poisson point process, PoissonVoronoi tessellation.
Participants : Giovanni Gherdovich, Xavier Descombes [ contact ] .
By `camera pose estimation' we mean the problem of determining the position and orientation of a camera with respect to the coordinate frame of the imaged scene. When acquiring this information from external instruments is too expensive for the application of interest, or is simply impossible because the picture is already taken, one must resort to computer vision techniques and use as best as possible the visual data available.
When the images contain artifacts like buildings or other nonnatural structures, classical approaches find straight lines, known angles, orthogonalities, or reference points, and use them to invert partially or totally the perspective distortion. Images of natural scenes do not offer such references, and other approaches need to be investigated.
We assume a set of features to be distributed on a planar surface (the world plane) as a Poisson point process, and to know their positions in the image plane. Then we propose an algorithm to recover the pose of the camera, in the case of two degrees of freedom (slant angle and distance from the ground). The algorithm is based on the observation that cell areas of the Voronoi tessellation generated by the points in the image plane represent a reliable sampling of the Jacobian determinant of the perspective transformation up to a scaling factor, the density of points in the world plane, which we demand as input. In the process, we develop a transformation of our input data (areas of Voronoi cells) so that they show almost constant variance between locations, and find analytically a correcting factor to considerably reduce the bias of our estimates. We perform intensive synthetic simulations and show that with a few hundred random points the errors in angle and distance are not more than a few percent. This work will be used in the future to detect and count penguins in Antartica.
3D building reconstruction by solving the direct problem
Keywords : birth and death dynamics, 3D reconstruction, building.
Participant : Xavier Descombes [ contact ] .
This study was supported by INRIA Associated Team ODESSA and an ECONET project. It was conducted in collaboration with P. Lukashevich, A. Krauchonak, and B. Zalesky from the UIIP in Minsk, E. Zhizhina from the IITP in Moscow, and J.D. Durou from IRIT in Toulouse.
In this project, we aim at reconstructing buildings in 3D from one or several aerial or high resolution satellite images. The main idea is to avoid solving the socalled inverse problem. We will simulate configurations of buildings and test them with respect to the data. The generation of configurations will be performed using multiple birthanddeath dynamics [2] . A Gibbs point process is defined including prior information about building configurations. To define the data term, the building configuration is projected into the data plane(s), using models of shading and shadows. This projection is performed using OpenGL for a fast 2D rendering of the scene. The data term is based on the consistency of shadows in the image and on the configuration projection in the image plane, whereas the prior penalizes building overlaps. The preliminary results are encouraging. The next steps consist of refining the data model by embedding information about gradients, and improving the convergence speed by defining proper birth maps for generating new buildings.
3D reconstruction of urban areas from optical and LIDAR data
Keywords : 3D reconstruction, stereovision, LIDAR, satellite imaging, building.
Participants : Florent Lafarge [ contact ] , Josiane Zerubia.
The generation of 3D representations of urban environments [23] from aerial and satellite data is a topic of growing interest in image processing and computer vision. Such environments are helpful in fields including urban planning, wireless communications, disaster recovery, navigation aids, and computer games. Laser scans have become more popular than multiview aerial/satellite images thanks to the accuracy of their measurements and the decrease in the cost of their acquisition. In particular, fullwaveform topographic LIDAR constitutes a new kind of laser technology providing interesting information for urban scene analysis [24] .
We study new stochastic models for analysing urban areas from optical and LIDAR data. We aim to construct concrete solutions to both urban object classification (i.e. detecting buildings, vegetation, etc.) and the 3D reconstruction of these objects. Probabilistic tools are well adapted to handling such urban objects, which may differ significantly in terms of complexity, diversity, and density within the same scene. In particular, JumpDiffusion based samplers offer interesting perspectives for modelling complex interactions between the various urban objects.
Detection of changes in building footprints using a Marked Point Process
Keywords : building extraction, change detection, marked point process, multiple birthanddeath dynamics.
Participants : Csaba Benedek, Xavier Descombes, Josiane Zerubia [ contact ] .
In this work [34] , we introduced a new probabilistic model to integrate building extraction with change detection in remotely sensed image pairs. A global optimization process attempts to find the optimal configuration of buildings, given the observed data, prior knowledge, and interactions between the neighbouring building parts. The methodological contributions can be divided into three key issues. First, we implemented a novel objectchange modelling approach based on Multitemporal Marked Point Processes, which simultaneously exploits low level change information between the time layers and an objectlevel building description to recognize and separate changed and unaltered buildings. Second, answering the challenges of data heterogeneity in aerial and satellite image repositories, we have constructed a flexible hierarchical framework which can create various building appearance models from different elementary feature based modules. Third, to achieve convergence and optimality while at the same time respecting the computational complexity constraints raised by increased data volume , we have adopted the fast Multiple Birth and Death optimization technique for change detection purposes, and proposed a novel nonuniform stochastic object birth process, which generates relevant objects with higher probability based on lowlevel image features.
Parameter estimation for marked point processes: application to object extraction from remote sensing images
Keywords : Marked point processes, RJMCMC, simulated annealing, SEM, pseudolikelihood, shape extraction.
Participants : Saima Ben Hadj, Xavier Descombes [ contact ] , Josiane Zerubia.
This internship was funded by a French Space Agency (CNES)contract
In order to extract object networks from high resolution remote sensing images, we previously defined a stochastic marked point process model. This model represents a random set of objects identified jointly by their position in the image and their geometric characteristics. It includes a prior energy term that corresponds to the penalization of overlapping objects (hardcore process) and a data energy term that quantifies the fit of the process objects to the image. To adapt the proposed model to any image, we introduce an energy weight which must be adjusted according to the processed image. We therefore studied methods for the estimation of the energy weight parameter of the proposed model. A method based on the SEM algorithm (Stochastic ExpectationMaximization) was proposed [15] , the process likelihood being approximated by the pseudolikelihood. This method has shown good performance on a simple model of point processes where the objects are circular.
In our work we generalize this estimation procedure to the case of more general geometrical shape models. We introduced an ellipse process to extract tree crowns and flamingos. The simulation of the proposed algorithm showed the relevance of the adopted model. We then introduced a rectangle process for extracting building footprints and tents. In addition, we introduced a more complex prior for rectangle alignment to improve the accuracy of the results (the buildings of major cities are usually aligned). Figures 12 and 13 show the extraction results for different object shapes.
Geometric Feature Extraction for Hierarchical Image Modelling
Keywords : aerial imagery, satellite imagery, image segmentation, Markov process, marked point process hierarchical models.
Participants : Raffaele Gaetano [ contact ] , Josiane Zerubia.
This work was done in collaboration with Dr Giuseppe Sacarpa and Prof. Gianni Poggi from University of Naples.
The ideas for the work introduced here arise from a former image modelling framework based on the Hierarchical Multiple Markov Chain model (HMMC) [6] , and its related unsupervised algorithm for hierarchical texturebased segmentation named Texture Fragmentation and Reconstruction (TFR). The TFR algorithms aims to detect the different textures in an image using a fragmentation and reconstruction approach: in the first stages of the algorithm, the image is oversegmented to provide a set of elementary regions, whose connected components are described using spectral and contextual features and clustered to form basic texture patterns. Later on, the resulting regions are sequentially merged, according to a suitable metric, to compose texture patterns at higher scales and contextually retrieve the underlying hierarchical image structure.
The TFR algorithm mainly uses spectral features to provide fine segmentation and to describe interactions among image regions. In many real world applications, where the focus on spectral properties can strongly limit the completeness of scene description, the TFR algorithm shows its major drawbacks: in particular, this is the case with satellite/aerial images containing a significant quantity of manmade structures (buildings, roads, fields, etc.).
Based on this observation, a new study has begun to include geometrical features in the TFR framework, while keeping the complexity of the overall technique low. The key idea is to provide a geometrical characterization of image elements by means of regular elementary shapes such as curves, rectangles, and ellipses. To this end, we resorted to a simplified marked point process framework that applies to segmentation maps rather than dealing directly with image data.
In this first stage of the work, a fast algorithm for feature extraction based on elementary shapes has been developed: starting from a segmentation map, each connected component is independently fitted to an elementary shape, by applying a marked point process that uses a simple data likelihood function, weighting the tradeoff between the uniformity of the interior and the diversity of the exterior labels. To limit the complexity of the process, a prior morphological analysis is performed on the initial map: the skeletons of the connected components are extracted, and used to estimate the principal dimensions and orientation of the reference shape. Its centre is also estimated by computing an approximation of the geodesic centre of the original shape. This information is then used to reduce the range over which the marks need be defined, thus speeding up the process.
High resolution SAR image classification
Keywords : synthetic aperture radar (SAR), classification, copula, parametric estimation, Markov random field.
Participants : Vladimir Krylov, Josiane Zerubia [ contact ] .
This research was partially funded by the French Space Agency (CNES). The data were provided by CNES/DLR (TerraSARX images) and the Italian Space Agency (COSMOSkyMed images).
The classification of highresolution SAR images represents a challenging problem, whose complexity is due not only to the usual presence of speckle (typical of all SAR data) but especially to the impact of various ground materials and structures on the data statistics. Unlike coarser resolution SAR, highresolution sensors provide a spatially much more detailed view of the observed scene and allow the effects of different ground materials to be observed. This requires the development of novel classification methods able jointly to model the statistics of such spatially heterogeneous data and effectively exploit the related contextual information.
In this work [21] , we proposed a contextual method that combines the Markov random field (MRF) approach to Bayesian image classification and a finite mixture technique for probability density function (pdf) estimation. We employed the `dictionarybased stochastic expectation maximization' (DSEM) scheme to model the statistics of single polarizations, and then combined the estimated marginals into joint distributions by means of copulas. The novelty of this approach [37] is in using the finite mixture technique together with copulas for the likelihood term in the MRF classifier in order to model the spatialcontextual information associated to heterogeneous highresolution SAR.
Experimental results with TerraSARX and COSMO/SkyMed data showed high estimation accuracy (above 90% correct classification), thus validating both the effectiveness of DSEM, when used to model the statistics of TerraSARX data, and the accuracy of the proposed MRFDSEM technique in the classification of highresolution SAR satellite imagery.
Advanced imageprocessing for very high resolution SAR and optical images
Keywords : risks, environment, high resolution, synthetic aperture radar (SAR), optical, Markov random fields.
Participants : Aurélie Voisin, Josiane Zerubia [ contact ] .
This work was done in collaboration with Dr. Gabriele Moser and Porf. Sebastiano Serpico from DIBE, Genoa University.
Current and future Earth observation (EO) missions, such as COSMO/SkyMed, Pléiades, and TerraSARX, have huge potential as precious information sources for monitoring human settlements and related infrastructures, thanks to their very high resolution (around 1m) and short revisit time (as little as 12–24h). In practice, in order to exploit effectively this potential, powerful processing techniques are needed. This work focuses on the development of advanced image processing and analysis methods, based on pattern recognition and stochastic modelling, as an aid to multirisk monitoring of infrastructures and urban areas.
First, special attention is devoted to SAR data (VHR COSMO/SkyMed imagery), and to modelling the statistics of high resolution SAR images. Accurately modelling these statistics is a critical task for most SAR image processing problems, for example, denoising, feature extraction, and classification. The modelling is based on the development of probability density estimation techniques. Finite mixtures will be used to model the combination of the probability density components related to distinct materials, while dictionary and generalized Gaussian methods will be adopted as flexible models for the statistics of each component. Dealing with images of urban and infrastructure areas, particular attention will be devoted to modelling separately impulsive components related to point scatterers in the scene and absolutely continuous components corresponding to distributed scattering entities. These methods have already been developed in the context of mediumresolution SAR, and were recently adapted to and validated on COSMO/SkyMed images. As compared to lowerresolution data, VHR images of urban areas or human settlements allow a spatially much more detailed view of the observed scene. This allows different materials (e.g. roofs, concrete, water, grass) present in the imaged area more strongly to affect the backscattering coefficient, thus yielding more complicated statistics, resulting from the mixture of many unknown components. Then, novel image classification methods, integrated with the probability density estimation techniques, will be proposed to map the ground areas affected by a given event (e.g. flooded areas or burnt forest areas).
Second, SAR data will be used jointly with optical data in order to analyse and effectively exploit the complementary properties of these two remotesensing data types. Specifically, an approach using Markov random fields (MRF) will be used because of its ability to fuse data. In addition, a separate datafusion approach will be exploited in order to develop multiscale classification methods that extract from the available input image(s) multiscale features (e.g. via wavelet transforms) and fuse them in the classification process to achieve both robustness to noise at coarser scales and sensitivity to spatial details at finerscales.
All the proposed methods will be extensively validated via experiments on real COSMO/SkyMed and multisensor images. Experimental results will be assessed both qualitatively and quantitatively.