Team METISS


Section: New Results

Content description of music signals

Multi-pitch signal modeling

Participant : Emmanuel Vincent.

Main collaborations: N. Bertin and R. Badeau (Telecom ParisTech)

Music involves several levels of information, from the acoustic signal up to cognitive quantities such as composer style or key, through mid-level quantities such as a musical score or a sequence of chords. The dependencies between mid-level and lower- or higher-level information can be represented through acoustic models and language models, respectively. Our past work on acoustic models based on nonnegative matrix factorization (NMF) was finalized and led to several publications [32], [23], [33], [34]. These models represent an input short-term magnitude spectrum as a linear combination of magnitude spectra corresponding to different pitches, which are adapted to the input under harmonicity and temporal smoothness constraints. In addition, the convergence properties of NMF algorithms were analyzed [22].
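The decomposition of a spectrum into pitch-specific spectra can be illustrated with a minimal sketch. It is not the model of the cited publications: the pitch templates here are fixed toy harmonic combs rather than spectra adapted to the input, there is no temporal smoothness constraint, and the weights are estimated with the standard multiplicative update for the Euclidean cost. The function names (harmonic_template, estimate_activations), the 1/k partial decay and the bin indices are all assumptions made for illustration.

```python
# Illustrative sketch only: fixed harmonic templates and a plain Euclidean
# multiplicative update are assumptions, not the models of the cited papers.

def harmonic_template(f0_bin, n_bins, n_partials=4):
    """Toy magnitude spectrum with partials at multiples of f0_bin, decaying as 1/k."""
    spectrum = [0.0] * n_bins
    for k in range(1, n_partials + 1):
        b = k * f0_bin
        if b < n_bins:
            spectrum[b] += 1.0 / k
    return spectrum


def estimate_activations(v, templates, n_iter=200, eps=1e-12):
    """Estimate nonnegative weights h such that v is close to sum_p h[p] * templates[p]."""
    n_pitches = len(templates)
    n_bins = len(v)
    h = [1.0] * n_pitches
    for _ in range(n_iter):
        # Model spectrum under the current weights.
        model = [sum(h[p] * templates[p][f] for p in range(n_pitches))
                 for f in range(n_bins)]
        # Multiplicative update: all weights are updated from the same model,
        # and they stay nonnegative because every factor is nonnegative.
        for p in range(n_pitches):
            num = sum(templates[p][f] * v[f] for f in range(n_bins))
            den = sum(templates[p][f] * model[f] for f in range(n_bins))
            h[p] *= num / (den + eps)
    return h


if __name__ == "__main__":
    n_bins = 64
    # Three hypothetical pitches; the first and third share a partial at bin 12.
    templates = [harmonic_template(f0, n_bins) for f0 in (4, 5, 6)]
    # Synthetic input: a mixture of the first and third pitch only.
    v = [2.0 * a + 0.5 * c for a, c in zip(templates[0], templates[2])]
    print([round(x, 3) for x in estimate_activations(v, templates)])
```

On this synthetic mixture the estimated weights should come out close to 2, 0 and 0.5, even though two of the templates overlap at one frequency bin.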

Music language modeling

Participants : Emmanuel Vincent, Frédéric Bimbot.

Main collaboration: Ricardo Scholz (internship student).

We started working on the modeling of music as a language by studying N-gram models of chord sequences. We investigated various chord labelling schemes and various model smoothing techniques originally designed for spoken language processing. While state-of-the-art models consider N = 2, we showed that more accurate models with N > 2 could be learned from a limited set of data [54].
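As a rough illustration of what an N-gram chord model with smoothing looks like, the sketch below trains a model of arbitrary order N on chord label sequences and evaluates it by perplexity. Additive (add-k) smoothing is used only because it is short to write; it is a stand-in for the smoothing techniques actually compared in [54], which are not detailed here. The chord labels, the "<s>" padding symbol, the value of k and the function names are assumptions.

```python
# Illustrative sketch only: add-k smoothing is an assumed stand-in for the
# smoothing techniques studied in the report.
from collections import Counter
import math


def train_ngram(sequences, n, vocab, k=0.5):
    """Return prob(context, chord) for an order-n model with add-k smoothing."""
    ngram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        # Pad the start so that the first chords also have a full context.
        padded = ["<s>"] * (n - 1) + list(seq)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            ngram_counts[context + (padded[i],)] += 1
            context_counts[context] += 1
    size = len(vocab)

    def prob(context, chord):
        context = tuple(context)
        return ((ngram_counts[context + (chord,)] + k)
                / (context_counts[context] + k * size))

    return prob


def perplexity(prob, seq, n):
    """Per-chord perplexity of one sequence under the model."""
    padded = ["<s>"] * (n - 1) + list(seq)
    log_sum = 0.0
    for i in range(n - 1, len(padded)):
        log_sum += math.log(prob(padded[i - n + 1:i], padded[i]))
    return math.exp(-log_sum / len(seq))


if __name__ == "__main__":
    vocab = ["C", "F", "G", "Am"]          # hypothetical chord labels
    training = [["C", "F", "G", "C"]] * 2  # toy corpus
    trigram = train_ngram(training, 3, vocab)
    print(trigram(("C", "F"), "G"))
    print(perplexity(trigram, ["C", "F", "G", "C"], 3))
```

Smoothing is what keeps the probability of an unseen chord transition above zero, which matters increasingly as N grows and the training data become sparse relative to the number of possible contexts.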

Additional investigations (in the context of Christophe Hauser's internship) were carried out on how to integrate the language model with acoustic-level decoding, but they have not yet reached sufficient maturity to draw clear conclusions.

Music structuring

Participants : Frédéric Bimbot, Gabriel Sargent, Emmanuel Vincent.

In the context of the Quaero project, we started investigating various ways of describing the structure of musical content, with the dual purpose of proposing a data model for annotation and a simple paradigm for automatic algorithms.

Initially, a multi-layered approach was considered, composed of seven parallel layers of information: key, color, tempo, melody, rhythm, harmony and lyrics, which together govern most of the structure of a piece of music. Some of these layers are characterized by the existence of statistical changes (key, color, tempo) whereas the others are mostly structured by the presence of recurrent patterns (melody, rhythm, harmony and lyrics).

Beyond this, it appears that, in many situations, the structure of a piece of music is governed by a high-level structure linked to constant numbers of beats. This property serves as an anchor point for structure description and annotation and will be a central point of study in the PhD of Gabriel Sargent.

