Section: New Results

Multimedia data management

The ability to store multimedia information in digital form has spurred both the demand for and the supply of new electronic appliances (e.g., DVD players, digital cameras, mobile phones connected to the Web) and new applications (e.g., interactive video, digital photo albums, electronic postcards, distance learning). The increasing production of digital multimedia data magnifies the traditional problems of multimedia data management and creates new ones, such as content personalisation and access from mobile devices. The major issues lie in multimedia data modelling, physical storage and indexing, and query processing over multimedia data.

Scaling up multimedia indexing

Participants : José Martinez, Patrick Valduriez, Jorge Manjarrez.

An image database management system should rely on a DBMS, but we identified several shortcomings of relational DBMSs. We determined five key points and provided some answers to them: reducing the physical size of metadata, introducing replication, exploiting parallelism, classifying and distributing data, and differentiating physical indexing. Combining classification and parallel computing enables, in principle, algorithms of sublinear complexity [63]. We are conducting experiments to validate this theoretical result, with several simulations under various assumptions [62].
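As an illustration only, the sketch below shows one way classification can make a search sublinear in the collection size. It is not the algorithm of [63]: the function name `two_stage_search`, the one-dimensional data and the class layout are all assumptions made for this example. The query is first compared with one centroid per class, then only with the members of the closest class.

```python
import math

def two_stage_search(query, classes):
    """Approximate nearest-item search over pre-classified 1-D data.

    classes: list of (centroid, members) pairs (hypothetical layout).
    Returns the closest member of the closest class and the number
    of distance computations performed.
    """
    comparisons = 0

    # Stage 1: pick the class whose centroid is closest to the query.
    best_members, best_dist = None, math.inf
    for centroid, members in classes:
        comparisons += 1
        dist = abs(query - centroid)
        if dist < best_dist:
            best_dist, best_members = dist, members

    # Stage 2: scan only that class.
    best_item, best_dist = None, math.inf
    for item in best_members:
        comparisons += 1
        dist = abs(query - item)
        if dist < best_dist:
            best_dist, best_item = dist, item

    return best_item, comparisons

# Toy collection: 100 classes of 100 items each (10,000 items in total).
classes = [
    (100.0 * c + 49.5, [100.0 * c + i for i in range(100)])
    for c in range(100)
]
item, cost = two_stage_search(4217.2, classes)
```

With n items split into about sqrt(n) classes of about sqrt(n) items, the scan costs on the order of sqrt(n) comparisons instead of n (here 200 instead of 10,000). The result is approximate in general, since the true nearest item may belong to a neighbouring class, and each class could in principle be held and scanned on a separate node.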

Personal image collection management from mobile devices

Participants : Marc Gelgon, Antoine Pigeau, Afshin Nikseresht.

Extending image retrieval systems to personal image collections is an emerging need in both industry and academia. In particular, mobile devices such as camera-equipped phones are an interesting case for content creation and retrieval. Our objective is to recover the natural spatial, temporal or spatio-temporal structure present in such a data set. We had previously developed a technique for building and tracking a hierarchical temporal and geographical structure, modelled as a hierarchy of mixture models [54]. We have since proposed an alternative way of building and tracking such a hierarchy, which has a lower computational cost and is more robust to the non-Gaussianity of clusters, because the upper level of the hierarchy is fitted to the lower levels of the model rather than to the data. The iterative nature of this fitting enables a predict/update mechanism as new data flows in [55].

Decentralized, distributed learning of multimedia class models

Participants : Marc Gelgon, Afshin Nikseresht.

A fundamental task in multimedia content-based retrieval is class characterization (defining observations from audiovisual data and capturing its variability or criteria discriminating it from other classes). We assume large amounts of data, partly labelled with their class identifier, available on-line on a large scale. Since model estimation for many classes is expensive and the data is assumed initially distributed, we proposed a distributed learning technique. This context fits the flexible, dynamic distribution approach favoured in the Atlas project.

In the case of Gaussian mixture models (one of the most useful models for multimedia data), we proposed a parameter estimation technique based on gossip propagation and aggregation of models through the network. Model aggregation proceeds by minimizing an approximate KL-divergence (loss) at the parameter level, which avoids moving heavy multimedia data over the network. Concurrently, the reliability of the models being estimated is assessed [51], [52].
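To make "aggregation at the parameter level" concrete, here is a minimal sketch of a standard building block, the moment-preserving merge of weighted Gaussian components. It is a swapped-in, textbook operation and not necessarily the criterion used in [51], [52]; the function name `merge_components` and the one-dimensional setting are assumptions. The merged Gaussian is the single Gaussian that minimizes the KL-divergence from the mixture of the input components, and it is computed from weights, means and variances alone, so no raw data needs to be exchanged.

```python
def merge_components(components):
    """Merge weighted 1-D Gaussians into one by moment matching.

    components: list of (weight, mean, variance) tuples (hypothetical layout).
    Returns (weight, mean, variance) of the merged component.
    """
    total = sum(w for w, _, _ in components)
    mean = sum(w * m for w, m, _ in components) / total
    # Within-component variance plus spread of the means around the new mean.
    var = sum(w * (v + (m - mean) ** 2) for w, m, v in components) / total
    return total, mean, var

# Two equally weighted unit-variance components centred at 0 and 2.
merged = merge_components([(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)])
```

In a gossip setting, two nodes could exchange such parameter tuples and each apply a merge of this kind locally; how components are paired for merging is the part governed by the approximate KL criterion described above and is not shown here.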

Scaling up retrieval in a collection of mixture models

Participants : Jamal Rougui, Marc Gelgon, José Martinez.

When querying a multimedia database for a class identifier, a central quantity to evaluate is the likelihood of the query given each candidate model stored in the database (analogous to a distance). We focus on the case where models take the form of Gaussian mixtures and we consider, as a practical application and for implementation, the speaker recognition task. While tree-based indexing structures have been intensively studied for speeding up retrieval when database entries are vectors, open issues remain when retrieval operates among probabilistic models, which are fundamental to multimedia applications. We proposed techniques for building a tree of Gaussian mixture models as an index [57], [58]. This tree is built by determining groups of models, each assigned to a parent node, so that evaluating the likelihood given a parent node supplies a value as close as possible to the one that would have been computed from its children. Owing to the relation between likelihood loss and KL-divergence, optimal grouping of children into parent nodes amounts to optimizing the latter criterion.
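The quantity being indexed can be sketched as follows. This is a minimal, assumed illustration of scoring a query against every stored model by log-likelihood, i.e. the linear scan that a tree index is meant to avoid; it is not the indexing technique of [57], [58]. The names `gmm_log_likelihood` and `best_model`, the one-dimensional features and the two toy speaker models are hypothetical.

```python
import math

def gmm_log_likelihood(samples, model):
    """Log-likelihood of 1-D samples under a Gaussian mixture.

    model: list of (weight, mean, variance) components (hypothetical layout).
    """
    total = 0.0
    for x in samples:
        # Log of each weighted component density at x.
        terms = [
            math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for w, m, v in model
        ]
        # Log-sum-exp for numerical stability.
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total

def best_model(samples, models):
    """Return the name of the model giving the highest likelihood (linear scan)."""
    return max(models, key=lambda name: gmm_log_likelihood(samples, models[name]))

# Toy database of two speaker models and a short query.
models = {
    "speaker_a": [(0.6, 0.0, 1.0), (0.4, 3.0, 0.5)],
    "speaker_b": [(0.5, 10.0, 1.0), (0.5, 13.0, 2.0)],
}
query = [0.2, -0.5, 2.9, 3.3]
winner = best_model(query, models)
```

The scan evaluates every model, so its cost grows linearly with the size of the database. A tree of the kind described above would instead evaluate the query against parent models first and descend only into the most promising groups.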

