Team METISS

Bibliography

Major publications by the team in recent years

[1]
S. Arberet, R. Gribonval, F. Bimbot.
A Robust Method to Count and Locate Audio Sources in a Stereophonic Linear Instantaneous Mixture, in: Proc. of the Int'l. Workshop on Independent Component Analysis and Blind Signal Separation (ICA 2006), Charleston, South Carolina, USA, J. Rosca, D. Erdogmus, J. Príncipe, S. Haykin (editors), LNCS, Springer, March 2006, vol. 3889, p. 536–543.
[2]
M. Ben.
Approches robustes pour la vérification automatique du locuteur par normalisation et adaptation hiérarchique, Doctoral thesis, Université de Rennes 1, IRISA, Rennes (France), November 2004.
[3]
L. Benaroya, F. Bimbot, R. Gribonval.
Audio Source Separation With a Single Sensor, in: IEEE Trans. Audio, Speech and Language Processing, January 2006, vol. 14, no 1, p. 191–199.
[4]
F. Bimbot, J.-F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-Garcia, D. A. Reynolds.
A tutorial on text-independent speaker verification, in: EURASIP Journal on Applied Signal Processing, April 2004, vol. 2004, no 4, p. 430–451.
[5]
F. Bimbot, G. Gravier.
Evaluation des systèmes de reconnaissance de la parole, in: Evaluation des systèmes de traitement de l'information, Traité des Sciences et Techniques de l'Information, Hermes Science Publications, 2004, chap. 8, p. 189–213.
[6]
L. Borup, R. Gribonval, M. Nielsen.
Bi-framelet systems with few vanishing moments characterize Besov spaces, in: Appl. Comp. Harmonic Anal. (special issue on frames in harmonic analysis), 2004, vol. 17, no 1–2.
[7]
S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J.-F. Bonastre, G. Gravier.
The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News, in: European Conference on Speech Communication and Technology, 2005, p. 1149–1152.
[8]
S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J.-F. Bonastre, G. Gravier.
The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News, in: European Conference on Speech Communication and Technology, 2005.
[9]
R. Gribonval, R. M. Figueras i Ventura, P. Vandergheynst.
A simple test to check the optimality of sparse signal approximations, in: EURASIP Signal Processing, special issue on Sparse Approximations in Signal and Image Processing, March 2006, vol. 86, no 3, p. 496–510.
[10]
R. Gribonval, M. Nielsen.
Nonlinear approximation with dictionaries. I. Direct estimates, in: J. of Fourier Anal. and Appl., 2004, vol. 10, no 1.
[11]
R. Gribonval, M. Nielsen.
On approximation with spline generated framelets, in: Constructive Approx., January 2004, vol. 20, no 2, p. 207–232.
[12]
R. Gribonval, P. Vandergheynst.
On the exponential convergence of Matching Pursuits in quasi-incoherent dictionaries, in: IEEE Trans. Information Theory, January 2006, vol. 52, no 1, p. 255–261.
[13]
S. Huet, G. Gravier, P. Sébillot.
Are morpho-syntactic taggers suitable to improve automatic transcription?, in: Intl. Workshop on Text, Speech and Dialogue, 2006.
[14]
E. Kijak, G. Gravier, L. Oisel, P. Gros.
Audiovisual integration for tennis broadcast structuring, in: Multimedia Tools and Applications, 2006, vol. 30, no 3, p. 289–312.
[15]
G. Monaci, P. Jost, P. Vandergheynst, B. Mailhé, S. Lesage, R. Gribonval.
Learning Multi-Modal Dictionaries: Application to Audiovisual Data, in: Proc. of International Workshop on Multimedia Content Representation, Classification and Security (MCRCS'06), LNCS, Springer-Verlag, September 2006, vol. 4105, p. 538–545.
[16]
A. Ozerov.
Adaptation de modèles statistiques pour la séparation de sources mono-capteur. Application à la séparation voix / musique dans les chansons, Ph. D. Thesis, Université de Rennes 1, December 2006.
[17]
C. G. M. Snoek, M. Worring.
Time Interval Maximum Entropy based Event Indexing in Soccer Video, in: Proc. IEEE International Conference on Multimedia & Expo, July 2003, p. 481–484.
[18]
E. Vincent, R. Gribonval, C. Févotte.
Performance measurement in Blind Audio Source Separation, in: IEEE Trans. Audio, Speech and Language Processing, 2006, vol. 14, no 4, p. 1462–1469.

Publications of the year

Doctoral dissertations and Habilitation theses

[19]
R. Gribonval.
Sur quelques problèmes mathématiques de modélisation parcimonieuse, Habilitation à Diriger des Recherches (specialty: Mathematics), Université de Rennes 1, October 2007.
[20]
S. Lesage.
Apprentissage de dictionnaires structurés pour la modélisation parcimonieuse des signaux multicanaux, Ph. D. Thesis, Université de Rennes 1, April 2007.

Articles in refereed journals and book chapters

[21]
F. Bimbot.
Description des documents sonores, in: L'indexation multimédia : description et recherche automatique, P. Gros (editor), Traité IC2, Hermès, 2007, chap. 5, p. 137–161.
[22]
M. Davies, M. Jafari, S. Abdallah, E. Vincent, M. Plumbley.
Blind source separation using space-time independent component analysis, chap. 3, in: T.-W. Lee, S. Makino, H. Sawada (editors), Springer, 2007.
[23]
M. Delakis, G. Gravier, P. Gros.
Audiovisual Integration with Segment Models for Tennis Video Parsing, in: Computer Vision and Image Understanding, accepted for publication, 2007.
[24]
M. Delakis, G. Gravier, P. Gros.
Stochastic models for multimodal video analysis, to be published, Springer Verlag, 2007.
[25]
F. Bimbot, G. Gonon.
De la reconnaissance automatique du locuteur à la signature vocale, http://interstices.info, 2007.
[26]
G. Gravier, J.-F. Bonastre, S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri.
Évaluation des systèmes de transcription enrichie d'émissions radiophoniques, in: Les campagnes d'évaluation EVALDA, S. Chaudiron (editor), (to appear), Hermès Science, 2007.
[27]
G. Gravier.
Description multimodale multimedia, in: L'indexation multimédia : description et recherche automatique, P. Gros (editor), Traité IC2, Hermès, 2007, chap. 7, p. 191–214.
[28]
S. Huet, G. Gravier, P. Sébillot.
Toward the integration of NLP and ASR techniques: POS tagging and transcription, Springer Verlag, 2007.
[29]
S. Krstulovic, F. Bimbot, O. Boëffard, D. Charlet, D. Fohr, O. Mella.
Selecting Representative Speakers for a Speech Database on the Basis of Heterogeneous Similarity Criteria, C. Müller (editor), Springer, Berlin / Heidelberg, 2007, p. 276–292.
[30]
G. Monaci, P. Jost, P. Vandergheynst, B. Mailhé, S. Lesage, R. Gribonval.
Learning Multi-Modal Dictionaries, in: IEEE Trans. Image Processing, September 2007, vol. 16, no 9, p. 2272–2283.
[31]
A. Ozerov, P. Philippe, F. Bimbot, R. Gribonval.
Adaptation of Bayesian models for single channel source separation and its application to voice / music separation in popular songs, in: IEEE Trans. Audio, Speech and Language Processing, July 2007, vol. 15, no 5, p. 1564–1578.
[32]
A. Ozerov, P. Philippe, R. Gribonval, F. Bimbot.
Choix et adaptation de modèles statistiques pour la séparation de voix chantée à partir d'un seul microphone, in: Revue Française de Traitement du Signal, 2007, vol. 24, no 3, p. 211–224.
[33]
E. Vincent, R. Gribonval, M. Plumbley.
Oracle estimators for the benchmarking of source separation algorithms, in: Signal Processing, 2007, vol. 87, no 8, p. 1933–1950.
[34]
E. Vincent, M. Plumbley.
Low bitrate object coding of musical audio using bayesian harmonic models, in: IEEE Trans. on Audio, Speech and Language Processing, 2007, vol. 15, no 4, p. 1273–1282.

Publications in Conferences and Workshops

[35]
S. Arberet, R. Gribonval, F. Bimbot.
A Robust Method to Count and Locate Audio Sources in a Stereophonic Linear Anechoic Mixture, in: Proc. IEEE Intl. Conf. Acoust. Speech Signal Process (ICASSP'07), April 2007.
[36]
D. Charlet, M. Collet, F. Bimbot.
VZ-Norm: an extension of Z-norm to the Multivariate Case for Anchor Model based Speaker Verification, in: European Conf. on Speech Communication and Technology – Interspeech, 2007.
[37]
G. Gravier, D. Moraru.
Towards phonetically-driven hidden Markov models: Can we incorporate phonetic landmarks in HMM-based ASR?, in: Proc. ISCA Tutorial and Research Workshop on Non Linear Speech Processing, M. C. et al. (editor), Lecture Notes in Artificial Intelligence, Springer Verlag, 2007, vol. 4885, p. 161–168.
[38]
R. Gribonval, B. Mailhé, H. Rauhut, K. Schnass, P. Vandergheynst.
Multichannel thresholding with sensing dictionaries, in: Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP'07), 2007.
[39]
R. Gribonval, B. Mailhé, H. Rauhut, K. Schnass, P. Vandergheynst.
Average Case Analysis of Multichannel Thresholding, in: Proc. IEEE Intl. Conf. Acoust. Speech Signal Process (ICASSP'07), April 2007.
[40]
S. Huet, G. Gravier, P. Sébillot.
Morphosyntactic processing of N-best lists for improved recognition and confidence measure computation, in: European Conf. on Speech Communication and Technology – Interspeech, 2007.
[41]
D. Moraru, G. Gravier.
Landmark Based Large Vocabulary Continuous Speech Recognition, in: Proc. Conf. on Speech Technology and Human-Computer Dialogue, 2007.
[42]
K. Schnass, P. Vandergheynst, R. Gribonval, H. Rauhut.
Average case analysis of multichannel sparse approximations using p-thresholding, in: SPIE Optics and Photonics, Wavelets XII, San Diego, 2007.
[43]
R. Tavenard, L. Amsaleg, G. Gravier.
Estimation de similarité entre séquences de descripteurs à l'aide de machines à vecteurs supports, in: Proc. Conf. Base de Données Avancées, 2007.
[44]
R. Tavenard, L. Amsaleg, G. Gravier.
Machines à vecteurs supports pour la comparaison de séquences de descripteurs, in: Proc. Journées d'étude et d'échange COmpression et REprésentation des Signaux Audiovisuels, 2007.
[45]
W. Teng, G. Gravier, F. Bimbot, F. Soufflet.
Rapid Speaker Adaptation by Reference Model Interpolation, in: European Conf. on Speech Communication and Technology – Interspeech, 2007.
[46]
E. Vincent, N. Bertin, R. Badeau.
Two non-negative matrix factorization methods for polyphonic pitch transcription, in: Proc. Music Information Retrieval Evaluation eXchange (MIREX), 2007.
[47]
E. Vincent, R. Gribonval.
Blind criterion and oracle bound for instantaneous audio source separation using adaptive time-frequency representations, in: Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2007.
[48]
E. Vincent.
Complex nonconvex lp norm minimization for underdetermined source separation, in: Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), 2007, p. 430–437.
[49]
E. Vincent, H. Sawada, P. Bofill, S. Makino, J. Rosca.
First stereo audio source separation evaluation campaign: data, algorithms and results, in: Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), 2007, p. 552–559.
[50]
S. Welburn, M. Plumbley, E. Vincent.
Object-coding for resolution-free musical audio, in: Proc. AES Int. Conf. on new directions in high resolution audio, 2007.

Internal Reports

[51]
V. Allie-Crocitti, M. Ben, F. Bimbot, S. Busson, J. Chico, D. Degraeve, W. De Neve, G. Gonon, G. Nuyttens, P. Schmouker, C. Serre, D. Van Deursen.
WP3 Status Report on Indexation and Content Processing Activities, Technical report, 2007.
[52]
L. Borup, R. Gribonval, M. Nielsen.
Beyond coherence: recovering structured time-frequency representations, Technical report, IRISA, February 2007, no 1833.
[53]
R. Gribonval, H. Rauhut, K. Schnass, P. Vandergheynst.
Atoms of all channels, unite! Average case analysis of multi-channel sparse recovery using greedy algorithms, Preprint, IRISA, May 2007, no PI 1848.
[54]
E. Vincent, M. Plumbley.
Efficient Bayesian inference for harmonic models via adaptive posterior factorization, Technical report, 2007, no PI 1841.

References in notes

[55]
J. Bobin, Y. Moudden, J.-L. Starck, M. Elad.
Morphological Diversity and Source Separation, in: IEEE Signal Processing Letters, 2006, no 7, p. 409–412.
[56]
R. Boite, H. Bourlard, T. Dutoit, J. Hancq, H. Leich.
Traitement de la Parole, Presses Polytechniques et Universitaires Romandes, 2000.
[57]
J.-F. Bonastre, F. Bimbot, L.-J. Boë, J. Campbell, D. Reynolds, I. Magrin-Chagnolleau.
Person Authentication by Voice: A Need For Caution, in: Proc. Eurospeech'03, Geneva, 2003.
[58]
G. F. Cooper, E. Herskovits.
A Bayesian method for the induction of probabilistic networks from data, in: Machine Learning, 1992.
[59]
G. Gravier, F. Yvon, B. Jacob, F. Bimbot.
Sirocco, un système ouvert de reconnaissance de la parole, in: Journées d'étude sur la parole, Nancy, June 2002, p. 273–276.
[60]
F. Jelinek.
Statistical Methods for Speech Recognition, MIT Press, Cambridge, Massachusetts, 1998.
[61]
P. Jost, P. Vandergheynst, S. Lesage, R. Gribonval.
Learning redundant dictionaries with translation invariance property: the MoTIF algorithm, in: SPARS, Rennes, 2005.
[62]
S. Lesage, S. Krstulovic, R. Gribonval.
Séparation de sources dans le cas sous-déterminé : comparaison de deux approches basées sur des décompositions parcimonieuses, in: Proc. GRETSI, 2005.
[63]
Z. Luo, M. Gaspar, J. Liu, A. Swami.
Distributed signal processing in sensor networks, in: IEEE Signal Processing Magazine, July 2006, vol. 23, no 4, p. 14–15.
[64]
S. Mallat.
A Wavelet Tour of Signal Processing, 2nd edition, Academic Press, San Diego, 1999.
[65]
A. Ozerov, R. Gribonval, P. Philippe, F. Bimbot.
Séparation voix / musique à partir d'enregistrements mono : quelques remarques sur le choix et l'adaptation des modèles, in: Proc. GRETSI, 2005.
[66]
A. Ozerov, P. Philippe, R. Gribonval, F. Bimbot.
One microphone singing voice separation using source-adapted model, in: Proc. WASPAA, 2005.
[67]
M. Utiyama, H. Isahara.
A Statistical Model for Domain-Independent Text Segmentation, in: Proceedings of the 39th Annual Meeting of Association for Computational Linguistics, ACL'01, Toulouse, France, July 2001.
[68]
E. Vincent, R. Gribonval.
Construction d'estimateurs oracles pour la séparation de sources, in: Proc. GRETSI, 2005.
