In this project, we have so far provided automatic alignments at the word and phoneme levels for audio files from the TCOF corpus (Traitement de Corpus Oraux en Français). This corpus contains mainly spontaneous speech, recorded under various conditions, with a wide range of signal-to-noise ratios (SNR) and a large amount of overlapping speech. We tested different acoustic models and different adaptation methods for the forced speech-text alignment. Other corpora are currently being processed.


The work has mainly focused on three tasks: designing a corpus of French sentences and texts recorded by German speakers learning French, recording a corpus of German sentences read by French speakers, and developing tools for annotating the French and German corpora. Beforehand, two small preliminary corpora were designed and recorded in order to bring to the fore the most interesting phonetic issues to be investigated in the project. In addition, this preliminary work was used to test the recording devices, so as to guarantee the same recording quality in Saarbrücken and in Nancy, and to design and develop the recording software.

In this project, we also provided an automatic alignment procedure at the word and phoneme levels for four corpora: French sentences uttered by French speakers, French sentences uttered by German speakers, German sentences uttered by French speakers, and German sentences uttered by German speakers.

ANR ContNomina


In this project, MULTISPEECH is involved in optimizing the speech recognition models for the envisaged task. It also contributes to finding the best way of presenting the speech recognition results, in order to maximize the efficiency of communication between the hard-of-hearing person and the speaking person.


The Action de Développement Technologique Inria (ADT) FASST (2012–2014) was conducted by PAROLE in collaboration with the PANAMA and TEXMEX teams of Inria Rennes. It reimplemented in efficient C++ the Flexible Audio Source Separation Toolbox (FASST), originally developed in Matlab by the METISS team of Inria Rennes. This made it possible to apply FASST to larger data sets and opened it to a larger audience. The new C++ version was released in January 2014. Two modules were also developed for HTK and Kaldi in order to perform noise-robust speech recognition by uncertainty decoding.

ADT VisArtico

The Inria Technological Development Action (ADT) VisArtico (2013–2015) aims at developing and improving VisArtico, an articulatory visualization software. In addition to improvements to the basic functionalities, several articulatory analysis and processing tools are being integrated. We will also work on the integration of multimodal data.