Section: New Results
Audio motif and structure discovery
Motif discovery and sequence description
The work on sequence description corresponds to the Ph. D. Thesis of Romain Tavenard, in collaboration with Laurent Amsaleg from the Texmex project-team.
Audio motif discovery aims at finding repeating patterns in large audio streams in an unsupervised manner. Using the segmentation framework defined in  , we proposed a motif discovery method that tolerates variations in both the spectral and temporal domains. Our algorithm relies on several extensions of the well-known dynamic time warping (DTW) algorithm to a word-spotting framework in which motif boundaries are unknown. Results on a word discovery task, to appear in  , demonstrate the effectiveness of the method in retrieving repeated words or locutions from radio broadcast news data.
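To make the idea of matching with unknown boundaries concrete, here is a minimal sketch of standard subsequence DTW, in which a short query may start and end anywhere in a longer stream. It is not the extended algorithm described above: the function name subsequence_dtw, the scalar features and the absolute-difference local distance are all assumptions chosen for illustration, whereas real audio descriptors would be vectors (e.g. cepstral coefficients) compared with a vector distance.

```python
def subsequence_dtw(query, stream, dist=None):
    """Return (cost, start, end) of the best alignment of `query` inside `stream`.

    Illustrative sketch only: scalar features and an absolute-difference
    local distance are assumptions, not the descriptors used in the report.
    """
    if dist is None:
        dist = lambda a, b: abs(a - b)
    n, m = len(query), len(stream)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of query[0..i] that ends at stream[j]
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: stream index where that cheapest alignment begins
    start = [[0] * m for _ in range(n)]

    # Open beginning: the first query frame may match any stream frame for free.
    for j in range(m):
        cost[0][j] = dist(query[0], stream[j])
        start[0][j] = j

    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + dist(query[i], stream[0])
        start[i][0] = start[i - 1][0]
        for j in range(1, m):
            # Classic DTW step pattern: diagonal, vertical, horizontal.
            best_cost, best_start = min(
                (cost[i - 1][j - 1], start[i - 1][j - 1]),
                (cost[i - 1][j], start[i - 1][j]),
                (cost[i][j - 1], start[i][j - 1]),
            )
            cost[i][j] = best_cost + dist(query[i], stream[j])
            start[i][j] = best_start

    # Open end: the alignment may stop at any stream frame.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


result = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 2, 3, 0, 0])
```

In this toy example the query is found between stream indices 2 and 5 with zero cost, even though the value 2 is repeated in the stream, which illustrates the tolerance to temporal variation. Discovering motifs when neither occurrence is known in advance requires further relaxations of the boundary conditions, which this sketch does not cover.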
Audio discovery requires fast comparison of sequences of audio descriptors in order to search efficiently for candidate motifs in a data stream. In the Ph. D. work of Romain Tavenard, we explore models of sequences of audio descriptors, and the associated distances between models, as an alternative to the traditional DTW-based solutions. In 2008, we focused on large-scale experiments with the support vector regression (SVR) approach developed in 2007. Experimental results demonstrated the effectiveness of SVR in retrieving ads and jingles in a database containing 100 hours of radio broadcast news data. Moreover, SVR was found to be less sensitive to segmentation issues than DTW.
Structure learning in Bayesian networks
Participant: Guillaume Gravier.
Work carried out in the framework of Siwar Baghdadi's Ph. D. Thesis, with Thomson Multimedia and the Texmex project-team.
A key issue in statistical modeling is model selection, i.e., choosing the model best suited to the problem to solve. In the Ph. D. work of Siwar Baghdadi, we investigated several approaches to automate the model selection process in the framework of Bayesian networks applied to multimodal modeling of sport broadcasts. We emphasized the limits of the K2 structure learning algorithm (based on the Bayesian information criterion) for classification problems and investigated alternative algorithms based on discriminative criteria. Discriminative model selection algorithms were found to be more adequate for classification problems and to eliminate the need for feature selection. Extensions of this work to structure learning in dynamic Bayesian networks were also investigated. Results on an action detection task in soccer videos demonstrate that the modeling step can be fully automated.
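For readers unfamiliar with score-based structure learning, the following is a minimal sketch of a K2-style greedy parent search scored with the Bayesian information criterion, as the paragraph above describes. It assumes fully observed discrete data and a fixed variable ordering; the names bic_family_score and k2_search, the max_parents limit and the toy data are hypothetical and do not come from the work reported here. The discriminative criteria investigated in the thesis are not shown.

```python
import math
from collections import Counter


def bic_family_score(data, child, parents, arity):
    """BIC contribution of `child` given `parents` (discrete, complete data)."""
    n = len(data)
    # Counts of (parent configuration, child value) and of parent configuration.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximum-likelihood log-likelihood of the child given its parents.
    loglik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint.items())
    # Free parameters: (child arity - 1) per parent configuration.
    n_params = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return loglik - 0.5 * math.log(n) * n_params


def k2_search(data, order, arity, max_parents=2):
    """Greedy K2-style search: add the best-scoring predecessor while it helps."""
    parents = {v: [] for v in order}
    for idx, child in enumerate(order):
        best = bic_family_score(data, child, parents[child], arity)
        candidates = set(order[:idx])  # only predecessors in the ordering
        while candidates and len(parents[child]) < max_parents:
            score, cand = max(
                (bic_family_score(data, child, parents[child] + [c], arity), c)
                for c in candidates
            )
            if score <= best:
                break  # no candidate improves the score
            parents[child].append(cand)
            candidates.remove(cand)
            best = score
    return parents


# Toy data (assumption): variable 1 copies variable 0, variable 2 is independent.
toy = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 10
structure = k2_search(toy, order=[0, 1, 2], arity=[2, 2, 2])
```

On this toy data the search keeps the edge from variable 0 to variable 1 and leaves variable 2 without parents, because the BIC penalty outweighs the zero likelihood gain of an uninformative parent. Such a generative score rewards fitting the joint distribution rather than separating classes, which is the limitation for classification noted above.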