Section: New Results
Low-level visual motion analysis
Clustering point trajectories with various life-spans
Participants : Matthieu Fradet, Patrick Pérez.
[In collaboration with Ph. Robert, Thomson Research]
Motion-based segmentation of an image sequence is an essential step in many video analysis applications, including action recognition and surveillance. In this work [33], we introduce a new approach to motion segmentation that operates on point trajectories. Each trajectory has its own start and end instants, hence its own life-span, depending on the pose and appearance changes of the object it belongs to. A set of such trajectories is obtained by tracking sparse interest points. Based on an adaptation of the recently proposed J-linkage method, these trajectories are then clustered using series of affine motion models estimated between consecutive instants and an appropriate residual that can handle trajectories with different life-spans. Our approach does not require any completion of trajectories whose life-span is shorter than the sequence of interest. We have evaluated the performance of motion as a single cue, without considering spatial priors or appearance. Using a standard test set, we validate the new algorithm and compare it with existing ones. Experimental results on a variety of challenging real sequences demonstrate the potential of our approach, even when no other image features (colours and contours in particular) are jointly exploited.
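The core of the clustering step can be summarised as follows. The sketch below is not the code of [33]; it uses hypothetical data structures, an illustrative sample size, hypothesis count and inlier threshold. It generates affine motion hypotheses from random minimal samples of trajectories, computes each residual only over the frames a trajectory actually covers, and merges trajectories J-linkage-style by intersecting their preference sets.

# Minimal sketch, not the code of [33]: J-linkage-style clustering of point
# trajectories with different life-spans. Each trajectory is assumed to be a
# dict {"start": first frame index, "pts": (L, 2) array of point positions};
# the sample size, number of hypotheses and inlier threshold are illustrative.
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine map A (2x3) such that dst ~ A @ [src, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])          # (N, 3)
    A, _, _, _ = np.linalg.lstsq(X, dst, rcond=None)
    return A.T                                            # (2, 3)

def hypothesis_from_sample(trajs, sample):
    """Series of per-instant affine models over the sample's common life-span."""
    t0 = max(trajs[i]["start"] for i in sample)
    t1 = min(trajs[i]["start"] + len(trajs[i]["pts"]) - 1 for i in sample)
    if t1 <= t0:                                          # no common motion
        return None
    models = {}
    for t in range(t0, t1):
        src = np.array([trajs[i]["pts"][t - trajs[i]["start"]] for i in sample])
        dst = np.array([trajs[i]["pts"][t + 1 - trajs[i]["start"]] for i in sample])
        models[t] = fit_affine(src, dst)
    return models

def residual(traj, models):
    """Mean transfer error over the frames shared by trajectory and hypothesis."""
    errs = []
    for t, A in models.items():
        k = t - traj["start"]
        if 0 <= k < len(traj["pts"]) - 1:
            p = np.append(traj["pts"][k], 1.0)
            errs.append(np.linalg.norm(A @ p - traj["pts"][k + 1]))
    return np.mean(errs) if errs else np.inf

def j_linkage(trajs, n_hyp=500, thresh=2.0, rng=np.random.default_rng(0)):
    # 1. Affine motion hypotheses from random minimal samples (3 trajectories).
    hyps = []
    for _ in range(20 * n_hyp):                           # cap sampling attempts
        if len(hyps) == n_hyp:
            break
        h = hypothesis_from_sample(trajs, rng.choice(len(trajs), 3, replace=False))
        if h:
            hyps.append(h)
    # 2. Preference sets: the hypotheses each trajectory agrees with.
    csets = [set(j for j, h in enumerate(hyps) if residual(tr, h) < thresh)
             for tr in trajs]
    clusters = [{i} for i in range(len(trajs))]
    # 3. Greedy merging of the two clusters with smallest Jaccard distance.
    while True:
        best, pair = 1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                union = csets[a] | csets[b]
                if union:
                    d = 1.0 - len(csets[a] & csets[b]) / len(union)
                    if d < best:
                        best, pair = d, (a, b)
        if pair is None:                                  # no shared preferences left
            break
        a, b = pair
        clusters[a] |= clusters[b]
        csets[a] &= csets[b]
        del clusters[b], csets[b]
    return clusters                                       # sets of trajectory indices

Because each residual is averaged over the intersection of a trajectory's life-span with the hypothesis' span, trajectories that never coexist with a hypothesis simply do not vote for it, which is what removes the need for trajectory completion.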
Multi-view synchronization of human actions and dynamic scenes
Participants : Émilie Dexter, Ivan Laptev, Patrick Pérez.
This work deals with the temporal synchronization of image sequences [31], [40], [41], [30]. Two instances of this problem are considered: (a) synchronization of human actions and (b) synchronization of dynamic scenes under view changes. To address both tasks and to handle large view variations reliably, we use self-similarity matrices, which remain stable across views. We propose time-adaptive descriptors that capture the structure of these matrices while being invariant to time warps between views. Two sequences are then synchronized by aligning their temporal descriptors with the Dynamic Time Warping algorithm. We have conducted quantitative comparisons between time-fixed and time-adaptive descriptors on image sequences with different frame rates. We have also demonstrated the performance of the approach on several challenging videos with large view variations, drastic independent camera motions and within-class variability of human actions.
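As an illustration of the overall pipeline, the following sketch (not the actual implementation) computes a self-similarity matrix from given per-frame features, extracts one descriptor per instant, and aligns two sequences with Dynamic Time Warping; a plain fixed-size diagonal patch stands in here for the time-adaptive descriptors of the papers.

# Minimal sketch, not the actual implementation: per-frame features (e.g.
# joint positions or flow histograms) are assumed given as a (T, d) array, and
# a fixed-size diagonal patch of the self-similarity matrix stands in for the
# time-adaptive descriptors.
import numpy as np

def self_similarity(feats):
    """Matrix of pairwise Euclidean distances between per-frame features."""
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    return np.sqrt(np.maximum(d2, 0.0))

def frame_descriptors(ssm, radius=10):
    """Flattened local patch of the SSM centred on each diagonal element."""
    padded = np.pad(ssm, radius, mode="edge")
    return np.array([padded[t:t + 2 * radius + 1, t:t + 2 * radius + 1].ravel()
                     for t in range(len(ssm))])

def dtw_align(desc_a, desc_b):
    """Classical DTW between two descriptor sequences; returns the warping path."""
    na, nb = len(desc_a), len(desc_b)
    cost = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    acc = np.full((na + 1, nb + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j - 1],
                                                 acc[i - 1, j], acc[i, j - 1])
    path, (i, j) = [], (na, nb)                 # backtrack from the last frames
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        i, j = [(i - 1, j - 1), (i - 1, j), (i, j - 1)][step]
    return path[::-1]                           # list of synchronized frame pairs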
Bag of salient tracklets for video copy detection
Participants : François Lecellier, Hervé Jégou, Patrick Pérez.
In the context of the Icos-HD ANR project 8.1.3, we have started to investigate the use of point tracks, or tracklets, as low-level features of interest for video indexing and retrieval. Sets of such tracklets capture distinctive motion information (as demonstrated in our work on multi-view video alignment 6.2.2 and view-independent action recognition 6.3.1) that should complement classic static descriptors such as SIFT or SURF for retrieval tasks. In particular, when individual frame representations become highly ambiguous (e.g., when searching for a given football shot in a database of football broadcasts), describing the dynamic content of the video is expected to be of much help. To this end, we have proposed ways to characterize the saliency of tracklets and to describe them compactly for search purposes, leading to the concept of a bag of salient tracklets. The experimental validation of this description for video copy detection is currently under way.
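A minimal sketch of such a bag-of-salient-tracklets signature is given below; the saliency measure (total path length), the tracklet descriptor (resampled displacements) and the codebook size are placeholder choices for illustration, not the options retained in the project.

# Minimal sketch with placeholder choices: saliency is measured here by total
# path length and the tracklet descriptor by resampled frame-to-frame
# displacements, neither of which is claimed to be the project's actual design.
# Each tracklet is a (L, 2) array of point positions, L >= 2.
import numpy as np
from scipy.cluster.vq import kmeans2, vq

def tracklet_descriptor(track, n_steps=15):
    """Resample the displacement sequence of a tracklet to a fixed length."""
    disp = np.diff(track, axis=0)                         # (L-1, 2) displacements
    idx = np.linspace(0, len(disp) - 1, n_steps).astype(int)
    return disp[idx].ravel()

def saliency(track):
    """Placeholder saliency: total path length of the tracklet."""
    return np.sum(np.linalg.norm(np.diff(track, axis=0), axis=1))

def bag_of_salient_tracklets(tracks, codebook, min_saliency=5.0):
    """L2-normalised histogram of codeword occurrences over salient tracklets."""
    descs = np.array([tracklet_descriptor(t) for t in tracks
                      if len(t) > 1 and saliency(t) > min_saliency])
    if len(descs) == 0:
        return np.zeros(len(codebook))
    words, _ = vq(descs, codebook)                        # nearest codeword per tracklet
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

# The codebook would be learnt offline on descriptors pooled from training videos:
# codebook, _ = kmeans2(pooled_training_descriptors, 1024, minit="++")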
Stochastic filtering technique for the tracking of closed curves
Participant : Patrick Pérez.
[In collaboration with E. Mémin and Ch. Avenel, Fluminance project-team]
See Fluminance activity report.
Visual tracking by online selection of tracking filters
Participants : Vijay Badrinarayanan, Patrick Pérez.
[In collaboration with Lionel Oisel and François Le Clerc, Thomson Research]
We have developed a probabilistic graphical model that integrates multiple tracking filters for target state estimation. Based on a special form of parametric graphical prior, this model allows the target state to be estimated as a linear combination of the posteriors propagated by a selected set of tracking filters. From this model we derive a novel, generic multi-cue switch tracker combining a dynamically changing set of point trackers with a colour-based particle filter. Each point tracker is cast as a pseudo-random particle filter to keep inference on the proposed model tractable. Experiments show that, in addition to the advantages gained from fusing multiple filters, the tracker also provides an effective way to tackle the problem of online target reference model update. Experimental results on very challenging sequences and comparative studies indicate that this tracker is suited for robust position tracking over extended durations. The proposed tracker is compared, qualitatively and quantitatively, to appropriate standard schemes in the literature, in order to highlight the advantages and expose the drawbacks of the proposed approach.
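The fusion principle can be illustrated as follows. The sketch uses invented names, represents each filter's posterior by a weighted particle set, and weights the filters by their average observation likelihood; that weighting is only an assumption of the sketch, standing in for the parametric graphical prior of the actual model.

# Minimal sketch with invented names: each tracking filter's posterior is
# represented by a weighted particle set, and the fused state estimate is a
# linear combination of the per-filter estimates. Weighting the filters by
# their average observation likelihood is an assumption of this sketch, not
# the parametric graphical prior of the actual model.
import numpy as np

class ParticleSet:
    """Weighted particles approximating one filter's posterior over the state."""
    def __init__(self, states, weights):
        self.states = np.asarray(states, dtype=float)     # (N, state_dim)
        self.weights = np.asarray(weights, dtype=float)
        self.weights /= self.weights.sum()

    def mean_state(self):
        return self.weights @ self.states

    def evidence(self, obs_likelihood):
        """Average observation likelihood of the particles in the current frame."""
        return float(self.weights @ obs_likelihood(self.states))

def fuse_filters(particle_sets, obs_likelihood):
    """Combine the filters' posteriors linearly; returns fused state and weights."""
    evid = np.array([ps.evidence(obs_likelihood) for ps in particle_sets])
    mix = evid / (evid.sum() + 1e-12)                     # per-filter mixture weights
    fused = sum(w * ps.mean_state() for w, ps in zip(mix, particle_sets))
    return fused, mix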