Section: Software

Keywords: speech, audio, signal, analysis, processing, audio stream, detection, tracking, segmentation, audio indexing, speaker verification.

SPro+AudioSeg: audio signal processing, segmentation and classification toolkit

Participants: Guillaume Gravier, Mathieu Ben, Daniel Moraru.

The SPro toolkit provides standard front-end analysis algorithms for speech processing. It is used systematically in the METISS group for work on speech and speaker recognition and on audio indexing. The toolkit is developed for Unix environments and is distributed as free software under the GPL license. It is also used by several other French laboratories working in speech processing.
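To give a concrete idea of what front-end analysis involves, the following Python sketch applies pre-emphasis, splits a signal into overlapping Hamming-windowed frames and computes a log energy per frame. It is an illustration only and does not reproduce SPro's code or interface: the function names, the pre-emphasis coefficient and the frame sizes (200 samples with an 80-sample shift, i.e. 25 ms and 10 ms at an assumed 8 kHz sampling rate) are assumptions chosen for the example.

```python
import math


def pre_emphasis(signal, coeff=0.95):
    """Apply the first-order filter y[n] = x[n] - coeff * x[n-1].

    The coefficient value is an assumption made for this example.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_log_energies(signal, frame_len=200, frame_shift=80, floor=1e-10):
    """Return the log energy of each overlapping Hamming-windowed frame.

    frame_len and frame_shift are in samples (hypothetical defaults).
    Trailing samples that do not fill a whole frame are ignored.
    """
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_shift):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        energy = sum(x * x for x in frame)
        # Floor the energy so that silent frames do not yield log(0).
        energies.append(math.log(max(energy, floor)))
    return energies
```

A real front-end would typically go on to compute spectral or cepstral features from each frame; the log energy is kept here because it is the simplest per-frame feature to illustrate.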

As part of our activities on audio indexing and speaker recognition, we develop and maintain audioseg, a toolkit for the segmentation of audio streams. It provides generic tools for the segmentation and indexing of audio streams under Unix: audio activity detection, abrupt change detection, segment clustering, Gaussian mixture modeling, and joint segmentation and detection using hidden Markov models. The toolkit relies on the SPro software for feature extraction.
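As an illustration of abrupt change detection, the sketch below looks for a single change point in a sequence of one-dimensional feature values by comparing a one-Gaussian model of the whole sequence with a two-Gaussian model split at each candidate position, using a penalty in the style of the Bayesian information criterion (BIC). This is one common way to approach the problem and is not claimed to be the method implemented in audioseg; the function names, the minimum segment length and the penalty weight are assumptions made for the example.

```python
import math


def _variance(values, floor=1e-8):
    """Maximum-likelihood variance, floored to keep the logarithm finite."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return max(var, floor)


def find_change_point(values, min_len=5, penalty_weight=1.0):
    """Return the index where the sequence most likely changes, or None.

    For each candidate split t, the gain of modeling values[:t] and
    values[t:] with separate Gaussians is compared with a BIC-style
    penalty for the two extra parameters (one mean and one variance).
    """
    n = len(values)
    if n < 2 * min_len:
        return None
    full_term = 0.5 * n * math.log(_variance(values))
    # 0.5 * (2 extra parameters) * log(n), scaled by a tunable weight.
    penalty = penalty_weight * math.log(n)
    best_index, best_delta = None, 0.0
    for t in range(min_len, n - min_len + 1):
        left, right = values[:t], values[t:]
        delta = (
            full_term
            - 0.5 * len(left) * math.log(_variance(left))
            - 0.5 * len(right) * math.log(_variance(right))
            - penalty
        )
        # Keep only splits whose gain exceeds the penalty.
        if delta > best_delta:
            best_index, best_delta = t, delta
    return best_index
```

In practice the same idea is applied to multidimensional feature vectors with full or diagonal covariance matrices and repeated over a sliding window to find several change points; the one-dimensional case is used here only to keep the example short.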

The audioseg toolkit has been used to develop a new speaker verification platform, validated through our participation in the NIST speaker recognition evaluation. It has also been used extensively in other work, in particular for the detection of audio events in video soundtracks.

