Bibliography

Major publications by the team in recent years
[1]
F. Bahja, J. Di Martino, E. H. Ibn Elhaj, D. Aboutajdine.
An overview of the CATE algorithms for real-time pitch determination, in: Signal, Image and Video Processing, 2013. [ DOI : 10.1007/s11760-013-0488-4 ]
https://hal.inria.fr/hal-00831660
[2]
J. Barker, E. Vincent, N. Ma, H. Christensen, P. Green.
The PASCAL CHiME Speech Separation and Recognition Challenge, in: Computer Speech and Language, February 2013, vol. 27, no 3, pp. 621-633. [ DOI : 10.1016/j.csl.2012.10.004 ]
https://hal.inria.fr/hal-00743529
[3]
A. Bonneau, D. Fohr, I. Illina, D. Jouvet, O. Mella, L. Mesbahi, L. Orosanu.
Gestion d'erreurs pour la fiabilisation des retours automatiques en apprentissage de la prosodie d'une langue seconde, in: Traitement Automatique des Langues, 2013, vol. 53, no 3.
https://hal.inria.fr/hal-00834278
[4]
D. Jouvet, D. Fohr.
Combining Forward-based and Backward-based Decoders for Improved Speech Recognition Performance, in: InterSpeech - 14th Annual Conference of the International Speech Communication Association - 2013, Lyon, France, August 2013.
https://hal.inria.fr/hal-00834282
[5]
A. Ozerov, M. Lagrange, E. Vincent.
Uncertainty-based learning of acoustic models from noisy data, in: Computer Speech and Language, February 2013, vol. 27, no 3, pp. 874-894. [ DOI : 10.1016/j.csl.2012.07.002 ]
https://hal.inria.fr/hal-00717992
[6]
A. Ozerov, E. Vincent, F. Bimbot.
A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, in: IEEE Transactions on Audio, Speech and Language Processing, May 2012, vol. 20, no 4, pp. 1118-1133, 16 p.
https://hal.archives-ouvertes.fr/hal-00626962
Publications of the year

Doctoral Dissertations and Habilitation Theses

[7]
A. Gorin.
Acoustic Model Structuring for Improving Automatic Speech Recognition Performance, University of Lorraine, November 2014.
https://hal.inria.fr/tel-01102029

Articles in International Peer-Reviewed Journals

[8]
A. Benichoux, L. S. R. Simon, E. Vincent, R. Gribonval.
Convex regularizations for the simultaneous recording of room impulse responses, in: IEEE Transactions on Signal Processing, January 2014. [ DOI : 10.1109/TSP.2014.2303431 ]
https://hal.inria.fr/hal-00934941
[9]
C. Fauth, A. Bonneau, O. Mella, V. Colotte, D. Fohr, D. Jouvet, Y. Laprie, J. Trouvain.
Constitution d'un Corpus de Français Langue Etrangère destiné aux Apprenants Allemands, in: SHS Web of Conferences, July 2014, vol. 8, 14 p. [ DOI : 10.1051/shsconf/20140801186 ]
https://hal.inria.fr/hal-01080630
[10]
N. Ito, E. Vincent, T. Nakatani, N. Ono, S. Araki, S. Sagayama.
Blind suppression of nonstationary diffuse noise based on spatial covariance matrix decomposition, in: Journal of Signal Processing Systems, July 2014.
https://hal.inria.fr/hal-01020255
[11]
Y. Laprie, R. Sock, B. Vaxelaire, B. Elie.
Comment faire parler les images aux rayons X du conduit vocal ?, in: SHS Web of Conferences, July 2014, vol. 8, 14 p. [ DOI : 10.1051/shsconf/20140801344 ]
https://hal.inria.fr/hal-01059887
[12]
N. Liu, A. Liutkus, J.-F. Aubry, L. Marsac, M. Tanter, L. Daudet.
Random Calibration for Accelerating MR-ARFI Guided Ultrasonic Focusing in Transcranial Therapy, in: Physics in Medicine and Biology, January 2015, vol. 60, no 3, 21 p. [ DOI : 10.1088/0031-9155/60/3/1069 ]
https://hal.inria.fr/hal-01104616
[13]
A. Liutkus, D. Fitzgerald, Z. Rafii, B. Pardo, L. Daudet.
Kernel Additive Models for Source Separation, in: IEEE Transactions on Signal Processing, June 2014. [ DOI : 10.1109/TSP.2014.2332434 ]
https://hal.inria.fr/hal-01011044
[14]
A. Liutkus, D. Martina, S. Popoff, G. Chardon, O. Katz, G. Lerosey, S. Gigan, L. Daudet, I. Carron.
Imaging With Nature: Compressive Imaging Using a Multiply Scattering Medium, in: Scientific Reports, July 2014, vol. 4. [ DOI : 10.1038/srep05552 ]
https://hal.inria.fr/hal-01025647
[15]
S. Raczynski, E. Vincent.
Genre-based music language modelling with latent hierarchical Pitman-Yor process allocation, in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, January 2014, vol. 22, no 3, pp. 672-681.
https://hal.inria.fr/hal-00804567
[16]
E. Vincent, N. Bertin, R. Gribonval, F. Bimbot.
From blind to guided audio source separation: How models and side information can improve the separation of sound, in: IEEE Signal Processing Magazine, May 2014, vol. 31, no 3, pp. 107-115.
https://hal.inria.fr/hal-00922378

Invited Conferences

[17]
E. Vincent, A. Sini, F. Charpillet.
Audio source localization by optimal control of a mobile robot, in: 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
https://hal.inria.fr/hal-01103949

International Conferences with Proceedings

[18]
K. Bartkova, D. Jouvet.
Links between Manual Punctuation Marks and Automatically Detected Prosodic Structures, in: Speech Prosody 2014, Dublin, Ireland, May 2014.
https://hal.archives-ouvertes.fr/hal-00998031
[19]
J. Beliao, A. Liutkus.
OOPS: une approche orientée objet pour l'interrogation et l'analyse linguistique de l'interface prosodie/syntaxe/discours, in: 4e Congrès Mondial de Linguistique Française, Berlin, Germany, July 2014, vol. 8, pp. 2565-2581. [ DOI : 10.1051/shsconf/20140801273 ]
https://hal.archives-ouvertes.fr/hal-01053422
[20]
F. Bimbot, G. Sargent, E. Deruty, C. Guichaoua, E. Vincent.
Semiotic Description of Music Structure: an Introduction to the Quaero/Metiss Structural Annotations, in: AES 53rd International Conference on Semantic Audio, London, United Kingdom, January 2014, 12 p, P1-1.
https://hal.archives-ouvertes.fr/hal-00931859
[21]
B. Dumortier, E. Vincent.
Blind RT60 estimation robust across room sizes and source distances, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 2014.
https://hal.inria.fr/hal-00941061
[22]
B. Elie, Y. Laprie.
Audiovisual to area and length functions inversion of human tract, in: Eusipco 2014, Lisbon, Portugal, September 2014.
https://hal.inria.fr/hal-01096547
[23]
C. Fauth, A. Bonneau.
L1-L2 interference: the case of devoicing of French voiced obstruents in final position by German learners - Pilot study, in: International Workshop on Multilinguality in Speech Research: Data, Methods and Models, Dagstuhl, Germany, Bernd Möbius and Jürgen Trouvain, Saarland University, Germany, April 2014.
https://hal.inria.fr/hal-01095183
[24]
C. Fauth, A. Bonneau, F. Zimmerer, J. Trouvain, B. Andreeva, V. Colotte, D. Fohr, D. Jouvet, J. Jügler, Y. Laprie, O. Mella, B. Möbius.
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process, in: LREC - 9th Language Resources and Evaluation Conference, Reykjavik, Iceland, The European Language Resources Association, May 2014.
https://hal.inria.fr/hal-00979026
[25]
D. Fitzgerald, A. Liutkus, Z. Rafii, B. Pardo, L. Daudet.
Harmonic/Percussive Separation Using Kernel Additive Modelling, in: IET Irish Signals & Systems Conference 2014, Limerick, Ireland, June 2014.
https://hal.inria.fr/hal-01000001
[26]
A. Gorin, D. Jouvet.
Component Structuring and Trajectory Modeling for Speech Recognition, in: Interspeech, Singapore, Singapore, September 2014.
https://hal.inria.fr/hal-01063653
[27]
A. Gorin, D. Jouvet.
Explicit trajectories and speaker class modeling for child and adult speech recognition, in: XXXème édition des Journées d'Etudes sur la Parole, Le Mans, France, June 2014.
https://hal.inria.fr/hal-01080343
[28]
A. Gorin, D. Jouvet.
Structured GMM Based on Unsupervised Clustering for Recognizing Adult and Child Speech, in: SLSP - 2nd International Conference on Statistical Language and Speech Processing, Grenoble, France, October 2014, pp. 108-119. [ DOI : 10.1007/978-3-319-11397-5_8 ]
https://hal.inria.fr/hal-01090472
[29]
A. Gorin, D. Jouvet, E. Vincent, D. Tran.
Investigating Stranded GMM for Improving Automatic Speech Recognition, in: 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014), Nancy, France, May 2014.
https://hal.inria.fr/hal-01003054
[30]
I. Illina, D. Fohr, G. Linares.
Extension du vocabulaire d’un système de transcription avec de nouveaux noms propres en utilisant un corpus diachronique, in: Journées d'Etude sur la parole, Le Mans, France, June 2014.
https://hal.inria.fr/hal-01092214
[31]
I. Illina, D. Fohr, G. Linares.
Proper Name Retrieval from Diachronic Documents for Automatic Speech Transcription using Lexical and Temporal Context, in: Workshop on Speech, Language and Audio in Multimedia, Penang, Malaysia, September 2014.
https://hal.inria.fr/hal-01092224
[32]
X. Jaureguiberry, E. Vincent, G. Richard.
Multiple-order non-negative matrix factorization for speech enhancement, in: Interspeech, Singapore, Singapore, September 2014, 4 p.
https://hal.archives-ouvertes.fr/hal-01023399
[33]
X. Jaureguiberry, E. Vincent, G. Richard.
Variational Bayesian model averaging for audio source separation, in: SSP (IEEE Workshop on Statistical Signal Processing), Australia, June 2014, 4 p.
https://hal.archives-ouvertes.fr/hal-00986909
[34]
D. Jouvet, D. Fohr.
About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models, in: INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, Singapore, September 2014.
https://hal.inria.fr/hal-01090483
[35]
S. Kırbız, A. Ozerov, A. Liutkus, L. Girin.
Perceptual coding-based informed source separation, in: 22nd European Signal Processing Conference (EUSIPCO-2014), Lisbon, Portugal, September 2014.
https://hal.inria.fr/hal-01016314
[36]
O. Lachhab, J. Di Martino, E. H. Ibn Elhaj, A. Hammouch.
Improving the recognition of pathological voice using the discriminant HLDA transformation, in: 3rd International IEEE Colloquium on Information Science and Technology, Tetuan-Chefchaouen, Morocco, October 2014.
https://hal.inria.fr/hal-01093309
[37]
Y. Laprie, M. Aron, M.-O. Berger, B. Wrobel-Dautcourt.
Studying MRI acquisition protocols of sustained sounds with a multimodal acquisition system, in: 10th International Seminar on Speech Production (ISSP), Cologne, Germany, May 2014.
https://hal.inria.fr/hal-01002121
[38]
Y. Laprie, B. Vaxelaire, M. Cadot.
Geometric articulatory model adapted to the production of consonants, in: 10th International Seminar on Speech Production (ISSP), Cologne, Germany, May 2014.
https://hal.inria.fr/hal-01002125
[39]
A. Liutkus, R. Badeau.
Generalized Wiener filtering with fractional power spectrograms, in: 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, IEEE, April 2015.
https://hal.archives-ouvertes.fr/hal-01110028
[40]
A. Liutkus, D. Fitzgerald, Z. Rafii.
Scalable audio separation with light kernel additive modelling, in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, IEEE, April 2015.
https://hal.inria.fr/hal-01114890
[41]
A. Liutkus, D. Martina, S. Gigan, L. Daudet.
Compressed sensing under strong noise. Application to imaging through multiply scattering media, in: European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, September 2014.
https://hal.inria.fr/hal-01074786
[42]
A. Liutkus, Z. Rafii, B. Pardo, D. Fitzgerald, L. Daudet.
Kernel Spectrogram models for source separation, in: HSCMA, Nancy, France, May 2014.
https://hal.inria.fr/hal-00959384
[43]
U. Musti, S. Ouni, Z. Ziheng.
3D Visual Speech Animation from Image Sequences, in: Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), Bangalore, India, ACM, December 2014.
https://hal.archives-ouvertes.fr/hal-01086073
[44]
L. Orosanu, D. Jouvet.
Combining words and syllables for speech transcription, in: XXXème édition des Journées d'Etudes sur la Parole, Le Mans, France, June 2014.
https://hal.inria.fr/hal-01080351
[45]
L. Orosanu, D. Jouvet.
Hybrid language models for speech transcription, in: INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, Singapore, September 2014.
https://hal.inria.fr/hal-01090478
[46]
N. Souviraà-Labastie, A. Olivero, E. Vincent, F. Bimbot.
Audio source separation using multiple deformed references, in: Eusipco, Lisbon, Portugal, September 2014.
https://hal.inria.fr/hal-01017571
[47]
N. Souviraà-Labastie, E. Vincent, F. Bimbot.
Music separation guided by cover tracks: designing the joint NMF model, in: 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, April 2015.
https://hal.archives-ouvertes.fr/hal-01108675
[48]
I. Steiner, P. Knopp, S. Musche, A. Schmiedel, A. Braun, S. Ouni.
Investigating the effects of posture and noise on speech production, in: 10th International Seminar on Speech Production (ISSP), Cologne, Germany, Susanne Fuchs, Martine Grice, Anne Hermes, Leonardo Lancia, Doris Mücke, May 2014.
https://hal.archives-ouvertes.fr/hal-01086066
[49]
D. Tran, N. Ono, E. Vincent.
Fast DNN training based on auxiliary function technique, in: 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Queensland, Australia, April 2015.
https://hal.inria.fr/hal-01107809
[50]
D. Tran, E. Vincent, D. Jouvet.
Extension of uncertainty propagation to dynamic MFCCs for noise robust ASR, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 2014.
https://hal.inria.fr/hal-00954654
[51]
D. Tran, E. Vincent, D. Jouvet.
Fusion of Multiple Uncertainty Estimators and Propagators for Noise Robust ASR, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 2014.
https://hal.inria.fr/hal-00955185
[52]
D. Tran, E. Vincent, D. Jouvet.
Discriminative uncertainty estimation for noise robust ASR, in: 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Queensland, Australia, April 2015.
https://hal.inria.fr/hal-01103969
[53]
E. Vincent, A. Gkiokas, D. Schnitzer, A. Flexer.
An investigation of likelihood normalization for robust ASR, in: Interspeech, Singapore, Singapore, September 2014.
https://hal.inria.fr/hal-01006142

National Conferences with Proceedings

[54]
M. Cadot, Y. Laprie.
Méthodologie 3-way d'extraction d'un modèle articulatoire de la parole à partir des données d'un locuteur, in: Atelier Fouille de Données Complexes des 14èmes Journées Francophones "Extraction et Gestion des Connaissances", Rennes, France, January 2014, pp. 1-12.
https://hal.archives-ouvertes.fr/hal-00934436
[55]
J. Thiemann, E. Vincent, S. Van De Par.
Spatial properties of the DEMAND noise recordings, in: 40th Annual German Congress on Acoustics (DAGA 2014), Oldenburg, Germany, March 2014.
https://hal.inria.fr/hal-00985979

Conferences without Proceedings

[56]
P.-A. Vuissoz, F. Odille, Y. Laprie, E. Vincent, G. Hossu, J. Felblinger.
Speech Cine SSFP with optical microphone synchronization and motion compensated reconstruction, in: ISMRM Workshop on Motion Correction in MRI, Tromsø, Norway, July 2014.
https://hal.inria.fr/hal-00994526
[57]
P.-A. Vuissoz, F. Odille, E. Vincent, J. Felblinger, Y. Laprie.
Synchronisation vocale et mouvement compensé en reconstruction pour une ciné IRM de la parole, in: 2e Congrès de la SFRMBM, Grenoble, France, March 2015.
https://hal.inria.fr/hal-01104230

Scientific Books (or Scientific Book chapters)

[58]
Z. Rafii, A. Liutkus, B. Pardo.
REPET for Background/Foreground Separation in Audio, in: Blind Source Separation, G. Naik, W. Wang (editors), Springer Berlin Heidelberg, 2014, pp. 395-411. [ DOI : 10.1007/978-3-642-55016-4_14 ]
https://hal.inria.fr/hal-01025563

Internal Reports

[59]
R. Badeau, A. Liutkus.
Proof of Wiener-like linear regression of isotropic complex symmetric alpha-stable random variables, September 2014.
https://hal.archives-ouvertes.fr/hal-01069612
[60]
J. Le Roux, E. Vincent.
A categorization of robust speech processing datasets, Mitsubishi Electric Research Labs, no TR2014-116, September 2014.
https://hal.inria.fr/hal-01063805
[61]
A. Liutkus.
Scale-Space Peak Picking, Inria Nancy - Grand Est (Villers-lès-Nancy, France), January 2015.
https://hal.inria.fr/hal-01103123

Scientific Popularization

[62]
E. Vincent.
Les sons à domicile, April 2014, Séminaire SAILOR "Imaginer des nouveaux lieux de vie".
https://hal.inria.fr/hal-00977674

Other Publications

[63]
A. Bonneau.
Phonetic variation in non-native speech, April 2014, Spring School : "Individual-centered Approaches to Speech Processing".
https://hal.inria.fr/hal-01095804
[64]
A. Piquard-Kipffer.
Critères d’évaluation d’un album numérique pour des enfants en difficulté de langage, in: M. Frisch (editor), Le réseau Idéki : objets de recherche, d’éducation et de formation émergents, problématisés, mis en tension, réélaborés, preface by Joël Lebeaume, L’Harmattan, Collection I.D, Paris, December 2014, pp. 287-309.
https://hal.inria.fr/hal-01097278
[65]
Y. Salaün, E. Vincent, N. Bertin, N. Souviraà-Labastie, X. Jaureguiberry, D. T. Tran, F. Bimbot.
The Flexible Audio Source Separation Toolbox Version 2.0, May 2014, ICASSP.
https://hal.inria.fr/hal-00957412
[66]
N. Souviraà-Labastie, A. Olivero, E. Vincent, F. Bimbot.
Multi-channel audio source separation using multiple deformed references, November 2014.
https://hal.inria.fr/hal-01070298
[67]
D. T. Tran, E. Vincent, D. Jouvet.
Nonparametric uncertainty estimation and propagation for noise robust ASR, January 2015.
https://hal.inria.fr/hal-01114329
[68]
E. Vincent.
Evaluation campaigns and reproducibility, January 2014, Journée GdR ISIS "reproductibilité en traitement du signal et des images".
https://hal.inria.fr/hal-00927741
References in notes
[69]
F. Bahja.
Détection du fondamental de la parole en temps réel : application aux voix pathologiques, Université Mohammed V-Agdal UFR Informatique et Télécommunications Laboratoire LRIT Unité associée au CNRST, URAC 29, Faculté des sciences, June 2013.
https://tel.archives-ouvertes.fr/tel-00927147
[70]
D. Fohr, O. Mella.
CoALT: A Software for Comparing Automatic Labelling Tools, in: Language Resources and Evaluation LREC 2012, Istanbul, Turkey, May 2012, pp. 325-328.
https://hal.archives-ouvertes.fr/hal-00761781
[71]
D. Jouvet, D. Fohr.
Analysis and Combination of Forward and Backward based Decoders for Improved Speech Transcription, in: TSD - 16th International Conference on Text, Speech and Dialogue - 2013, Pilsen, Czech Republic, I. Habernal, V. Matoušek (editors), Lecture Notes in Artificial Intelligence, Springer Verlag, September 2013, vol. 8082, pp. 84-91.
https://hal.inria.fr/hal-00834296
[72]
S. Ouni, L. Mangeonjean, I. Steiner.
VisArtico: a visualization tool for articulatory data, in: 13th Annual Conference of the International Speech Communication Association - InterSpeech 2012, Portland, OR, United States, September 2012.
https://hal.inria.fr/hal-00730733