Team METISS

Bibliography

Major publications by the team in recent years

[1]
S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J.-F. Bonastre, G. Gravier.
The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News, in: European Conference on Speech Communication and Technology, 2005.
[2]
R. Gribonval, R. M. Figueras i Ventura, P. Vandergheynst.
A simple test to check the optimality of sparse signal approximations, in: EURASIP Signal Processing, special issue on Sparse Approximations in Signal and Image Processing, March 2006, vol. 86, no 3, p. 496–510.
[3]
R. Gribonval.
Sur quelques problèmes mathématiques de modélisation parcimonieuse, Habilitation à Diriger des Recherches, speciality “Mathematics”, Université de Rennes I, October 2007.
[4]
R. Gribonval, M. Nielsen.
On approximation with spline generated framelets, in: Constructive Approx., January 2004, vol. 20, no 2, p. 207–232.
[5]
R. Gribonval, P. Vandergheynst.
On the exponential convergence of Matching Pursuits in quasi-incoherent dictionaries, in: IEEE Trans. Information Theory, January 2006, vol. 52, no 1, p. 255–261.
[6]
E. Kijak, G. Gravier, L. Oisel, P. Gros.
Audiovisual integration for tennis broadcast structuring, in: Multimedia Tools and Applications, 2006, vol. 30, no 3, p. 289–312.
[7]
A. Ozerov, P. Philippe, F. Bimbot, R. Gribonval.
Adaptation of Bayesian models for single channel source separation and its application to voice / music separation in popular songs, in: IEEE Trans. Audio, Speech and Language Processing, July 2007, vol. 15, no 5, p. 1564–1578.
[8]
A. Rosenberg, F. Bimbot, S. Parthasarathy.
Overview of Speaker Recognition, in: Springer Handbook of Speech Processing, J. Benesty, M. M. Sondhi, Y. Huang (editors), Springer, 2008, chap. 36, p. 725–741.
[9]
E. Vincent, R. Gribonval, C. Févotte.
Performance measurement in Blind Audio Source Separation, in: IEEE Trans. Audio, Speech and Language Processing, 2006, vol. 14, no 4, p. 1462–1469.
[10]
E. Vincent, M. Plumbley.
Low bitrate object coding of musical audio using Bayesian harmonic models, in: IEEE Trans. on Audio, Speech and Language Processing, 2007, vol. 15, no 4, p. 1273–1282.

Publications of the year

Doctoral Dissertations and Habilitation Theses

[11]
S. Arberet.
Estimation robuste et apprentissage aveugle de modèles pour la séparation de sources sonores, Ph. D. Thesis, Université de Rennes I, December 2008.
[12]
W. Teng.
Adaptation rapide au locuteur par sous-espace variable de modèles de référence, Ph. D. Thesis, Université de Rennes I, December 2008.

Articles in International Peer-Reviewed Journal

[13]
L. Borup, R. Gribonval, M. Nielsen.
Beyond coherence: recovering structured time-frequency representations, in: Appl. Comput. Harmon. Anal., 2008, vol. 24, no 1, p. 120–128.
[14]
M. Delakis, G. Gravier, P. Gros.
Audiovisual Integration with Segment Models for Tennis Video Parsing, in: Computer Vision and Image Understanding, August 2008, vol. 111, no 2, p. 142–154.
[15]
G. Gonon, F. Bimbot, R. Gribonval.
Probabilistic scoring using decision trees for fast and scalable speaker recognition, in: Speech Communication, to appear, 2009.
[16]
R. Gribonval, M. Nielsen.
Beyond sparsity: recovering structured representations by $\ell^1$-minimization and greedy algorithms, in: Advances in Computational Mathematics, January 2008, vol. 28, no 1, p. 23–41.
[17]
R. Gribonval, H. Rauhut, K. Schnass, P. Vandergheynst.
Atoms of all channels, unite! Average case analysis of multi-channel sparse recovery using greedy algorithms, 2008.
[18]
M. Jafari, E. Vincent, S. Abdallah, M. Plumbley, M. Davies.
An adaptive stereo basis method for convolutive blind audio source separation, in: Neurocomputing, 2008, vol. 71, no 10–12, p. 2087–2097.
[19]
P. Leveau, E. Vincent, G. Richard, L. Daudet.
Instrument-specific harmonic atoms for mid-level music representation, in: IEEE Transactions on Audio, Speech and Language Processing, 2008, vol. 16, no 1, p. 116–128.
[20]
E. Vincent, M. Plumbley.
Efficient Bayesian inference for harmonic models via adaptive posterior factorization, in: Neurocomputing, 2008, vol. 72, p. 79–87.

Articles in National Peer-Reviewed Journal

[21]
A. Bürki, C. Gendrot, G. Gravier, G. Linarès, C. Fougeron.
Alignement automatique et analyse phonétique : comparaison de différents systèmes pour l'analyse du schwa, in: Traitement Automatique des Langues, 2008, vol. 49, no 3.

International Peer-Reviewed Conference/Proceedings

[22]
A. L. Casanovas, G. Monaci, P. Vandergheynst, R. Gribonval.
Blind Audiovisual Separation based on Redundant Representations, in: Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP, 2008.
[23]
R. Gribonval, K. Schnass.
Dictionary identifiability from few training samples, in: Proc. European Conf. on Signal Processing - EUSIPCO, August 2008.
[24]
R. Gribonval, K. Schnass.
Some recovery conditions for basis learning by l1-minimization, in: 3rd IEEE International Symposium on Communications, Control and Signal Processing - ISCCSP 2008, March 2008, p. 768–773.
[25]
S. Huet, G. Gravier, P. Sébillot.
Morphosyntactic Resources for Automatic Speech Recognition, in: Intl. Conf. on Language, Resources and Evaluation, 2008.
[26]
M. Kowalski, E. Vincent, R. Gribonval.
Under-determined source separation via mixed-norm regularized minimization, in: Proc. European Signal Processing Conf. - EUSIPCO, 2008.
[27]
G. Lecorvé, G. Gravier, P. Sébillot.
An unsupervised Web-based topic language model adaptation method, in: IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, 2008.
[28]
G. Lecorvé, G. Gravier, P. Sébillot.
Using Internet as a corpus ..., in: Intl. Conf. on Language, Resources and Evaluation, 2008.
[29]
B. Lecouteux, G. Linarès, Y. Estève, G. Gravier.
Generalized driven decoding for speech recognition system combination, in: IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, 2008.
[30]
B. Mailhé, R. Gribonval, F. Bimbot, M. Lemay, P. Vandergheynst, J.-M. Vesin.
Dictionary learning for the sparse modelling of atrial fibrillation in ECG signals, in: ICASSP'09, 2009.
[31]
B. Mailhé, R. Gribonval, F. Bimbot, P. Vandergheynst.
A low complexity orthogonal matching pursuit for sparse signal approximation with shift-invariant dictionaries, in: ICASSP'09, 2009.
[32]
B. Mailhé, S. Lesage, R. Gribonval, P. Vandergheynst, F. Bimbot.
Shift-invariant dictionary learning for sparse representations: extending k-SVD, in: Proc. European Conf. on Signal Processing - EUSIPCO, 2008.
[33]
A. Muscariello, G. Gravier, F. Bimbot.
Variability tolerant motif discovery, in: Intl. Multimedia Model Conference, 2009.
[34]
F. Naini, R. Gribonval, L. Jacques, P. Vandergheynst.
Compressive sampling of pulse trains: Spread the spectrum!, in: ICASSP'09, 2009.
[35]
A. Nesbit, M. Plumbley, E. Vincent.
Oracle evaluation of flexible adaptive transforms for underdetermined audio source separation, in: Proc. UK ICA Research Network International Workshop, University of Liverpool, 2008.
[36]
P. Sudhakar, R. Gribonval.
A sparsity-based method to solve the permutation indeterminacy in frequency domain convolutive blind source separation, in: ICA'09, submitted, 2009.
[37]
E. Vincent, N. Bertin, R. Badeau.
Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription, in: Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2008, p. 109–112.

National Peer-Reviewed Conference/Proceedings

[38]
S. Huet, G. Gravier, P. Sébillot.
Un modèle multi-sources pour la segmentation en sujets de journaux radiophoniques, in: Proc. Traitement Automatique des Langues Naturelles, 2008.
[39]
G. Lecorvé, G. Gravier, P. Sébillot.
Vers une adaptation thématique non supervisée de modèles de langage : utilisation d'Internet comme un corpus ouvert, in: Journées d'Études sur la Parole, 2008.
[40]
B. Lecouteux, G. Linarès, Y. Estève, G. Gravier.
Combinaison de systèmes par décodage guidé, in: Journées d'Études sur la Parole, 2008.

Scientific Books (or Scientific Book chapters)

[41]
F. Bimbot.
Automatic Speaker Recognition, chap. 9, J. Mariani (editor), Hermès, 2009, 35 pages, to appear.
[42]
M. Delakis, G. Gravier, P. Gros.
Stochastic models for multimodal video analysis, in: Multimodal Processing and Interaction: Audio, Video, Text, P. Maragos, A. Potamianos, P. Gros (editors), Springer Verlag, 2008.
[43]
G. Gravier, J.-F. Bonastre, S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri.
Évaluation des systèmes de transcription enrichie d'émissions radiophoniques, in: L'évaluation des technologies de traitement de la langue, S. Chaudiron, K. Choukri (editors), Cognition et traitement de l'information, Hermès Science, 2008, chap. 7, p. 165–182.
[44]
S. Huet, G. Gravier, P. Sébillot.
Toward the integration of NLP and ASR techniques: POS tagging and transcription, in: Multimodal Processing and Interaction: Audio, Video, Text, P. Maragos, A. Potamianos, P. Gros (editors), Springer Verlag, 2008.
[45]
A. Rosenberg, F. Bimbot, S. Parthasarathy.
Overview of Speaker Recognition, in: Springer Handbook of Speech Processing, J. Benesty, M. M. Sondhi, Y. Huang (editors), Springer, 2008, chap. 36, p. 725–741.

Internal Reports

[46]
S. Arberet, R. Gribonval, F. Bimbot.
A robust method to count, locate and separate audio sources in a multichannel underdetermined mixture, Technical report, INRIA Research Report, 2008, no 6593.
[47]
M. Davies, R. Gribonval.
Restricted isometry constants where $\ell^p$ sparse recovery can fail for $0 < p \le 1$, Technical report, IRISA-INRIA Technical Report, July 2008, no 1899.

Other Publications

[48]
S. Arberet, A. Ozerov, R. Gribonval, F. Bimbot.
Blind spectral-GMM estimation for underdetermined instantaneous audio source separation, submitted, 2009.
[49]
A. Nesbit, E. Vincent, M. Plumbley.
Benchmarking flexible adaptive time-frequency transforms for underdetermined audio source separation, submitted, 2009.
[50]
A. Nesbit, E. Vincent, M. Plumbley.
Extension of sparse, adaptive signal decompositions to semi-blind audio source separation, submitted, 2009.
[51]
M. Puigt, E. Vincent, Y. Deville.
Validity of the independence assumption for the separation of instantaneous and convolutive mixtures of speech and music sources, submitted, 2009.
[52]
R. Scholz, E. Vincent, F. Bimbot.
Robust modeling of musical chord sequences using probabilistic N-grams, in: ICASSP'09, 2009.
[53]
E. Vincent, S. Araki, P. Bofill.
The 2008 Signal Separation Evaluation Campaign: A community-based approach to large-scale evaluation, submitted, 2009.
[54]
E. Vincent, S. Arberet, R. Gribonval.
Underdetermined instantaneous audio source separation via local Gaussian modeling, submitted, 2009.

References in notes

[55]
R. Baraniuk.
Compressive sensing, in: IEEE Signal Processing Magazine, July 2007, vol. 24, no 4, p. 118–121.
[56]
R. Boite, H. Bourlard, T. Dutoit, J. Hancq, H. Leich.
Traitement de la Parole, Presses Polytechniques et Universitaires Romandes, 2000.
[57]
M. Davy, S. J. Godsill, J. Idier.
Bayesian Analysis of Polyphonic Western Tonal Music, in: Journal of the Acoustical Society of America, 2006, vol. 119, no 4, p. 2498–2517.
[58]
G. Gravier, F. Yvon, B. Jacob, F. Bimbot.
Sirocco, un système ouvert de reconnaissance de la parole, in: Journées d'étude sur la parole, Nancy, June 2002, p. 273–276.
[59]
C. Herley.
ARGOS: Automatically Extracting repeating objects from multimedia streams, in: IEEE Trans. on Multimedia, February 2006, vol. 8, no 1, p. 115–129.
[60]
F. Jelinek.
Statistical Methods for Speech Recognition, MIT Press, Cambridge, Massachusetts, 1998.
[61]
S. Mallat.
A Wavelet Tour of Signal Processing, 2nd edition, Academic Press, San Diego, 1999.
[62]
K. Murphy.
An introduction to graphical models, 2001, http://www.cs.ubc.ca/~murphyk/Papers/intro_gm.pdf.
[63]
C. Ng, R. Wilkinson, J. Zobel.
Experiments in spoken document retrieval using phoneme n-grams, in: Speech Communication, 2000, vol. 32.
[64]
M. Utiyama, H. Isahara.
A Statistical Model for Domain-Independent Text Segmentation, in: Proceedings of the 39th Annual Meeting of Association for Computational Linguistics, ACL'01, Toulouse, France, July 2001.
[65]
N. Whiteley, A. T. Cemgil, S. J. Godsill.
Sequential Inference of Rhythmic Structure in Musical Audio, in: Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2007, p. 1321–1324.
