Synchronous acquisition of high-speed ultrasound video and audio speech data



Measuring the activity of the vocal tract during speech is critical in a variety of fields such as phonology, linguistics, speech pathology, anatomy and multimodal speech processing. Despite the success of MRI in recent years, ultrasound remains popular for vocal tract imaging and analysis, mainly because it is non-invasive, offers good time resolution, is clinically safe, and can image the tongue in non-supine subjects. Several systems able to acquire a sequence of ultrasound images of the tongue together with the uttered speech signal have been described in the literature. However, coupling an ultrasound imaging system with another imaging device, such as a high-speed camera, without decreasing the acquisition frame rate remains a difficult problem. To address it, I developed the Ultraspeech acquisition system, which, in addition to the acoustic signal, synchronously records both ultrasound and video streams at more than 60 fps on a single, easy-to-transport laptop-based machine. Ultraspeech is compatible with the Terason T3000 portable ultrasound imaging system, the Telemed systems (Echoblaster and MicroUS), the industrial cameras provided by Imaging Source, and ASIO-compatible soundcards. Ultraspeech is free to download at
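To illustrate what synchronous recording makes possible, here is a minimal sketch that maps an image frame to the span of audio samples recorded during it. It is not Ultraspeech code: the function name `audio_span_for_frame` is hypothetical, the 44.1 kHz sample rate is an assumption, and the sketch assumes that all streams start at the same instant and that the frame rate is constant.

```python
def audio_span_for_frame(frame_index, fps=60.0, sample_rate=44100):
    """Return (start, end) audio sample indices covered by one image frame.

    Assumption: audio and image streams start together and the frame
    rate is constant, so frame k spans [k / fps, (k + 1) / fps) seconds.
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    start = int(round(frame_index * sample_rate / fps))
    end = int(round((frame_index + 1) * sample_rate / fps))
    return start, end


if __name__ == "__main__":
    # At 60 fps and 44.1 kHz, each frame covers 735 audio samples.
    print(audio_span_for_frame(0))
    print(audio_span_for_frame(10))
```

With these assumed values, the same index arithmetic applies to the ultrasound and the video streams, so one time axis can serve all three signals.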


Check out these sequences recorded using Ultraspeech!




Intuitive visualization of ultrasound articulatory data for speech therapy and pronunciation training


Ultraspeech-player is standalone software dedicated to the visualization of ultrasound speech data recorded using Ultraspeech. It is designed for pronunciation training in the context of speech therapy and second language learning. The software displays natural tongue movements acquired from a reference speaker for different kinds of sequences (isolated vowels, VCV, swallowing, etc.). It also includes an audiovisual time-stretching module that lets the user slow down both the articulatory gesture and its corresponding acoustic realization (i.e. the speed of the audio signal is modified while the original pitch is preserved). This rendering technique aims at improving the way a naïve speaker perceives and understands a tongue gesture. Free to download from the Ultraspeech-player webpage!
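The idea of changing audio speed without changing pitch can be sketched with a plain overlap-add (OLA) time stretch. This is only an illustration, not the algorithm used by Ultraspeech-player, which is not specified here. The function name `time_stretch_ola` and its parameters are hypothetical. Windowed frames are read from the input at one hop size and written to the output at another, so the duration changes while each frame keeps its original spectral content.

```python
import numpy as np


def time_stretch_ola(signal, rate, frame_len=1024, synth_hop=256):
    """Naive overlap-add time stretch (illustrative sketch).

    rate < 1 slows the signal down (output is about 1/rate times longer),
    rate > 1 speeds it up. Frames are copied without resampling, so the
    pitch is not shifted.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    signal = np.asarray(signal, dtype=np.float64)
    # Read frames every `analysis_hop` samples, write them every `synth_hop`.
    analysis_hop = max(1, int(round(synth_hop * rate)))
    window = np.hanning(frame_len)
    n_frames = max(1, 1 + (len(signal) - frame_len) // analysis_hop)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        a = i * analysis_hop
        frame = signal[a:a + frame_len]
        if len(frame) < frame_len:
            # Zero-pad a short frame (only happens for very short inputs).
            frame = np.pad(frame, (0, frame_len - len(frame)))
        s = i * synth_hop
        out[s:s + frame_len] += frame * window
        norm[s:s + frame_len] += window
    # Divide by the summed windows to undo the window gain.
    norm[norm < 1e-8] = 1.0
    return out / norm


if __name__ == "__main__":
    sr = 16000  # assumed sample rate for the demo
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 220.0 * t)
    slow = time_stretch_ola(tone, rate=0.5)
    print(len(tone), len(slow))
```

Plain OLA does not align the phase of overlapping frames, so it produces audible artifacts on real speech; methods such as WSOLA or a phase vocoder are commonly used to address this. To keep image and sound together, the display time of each image frame would be divided by the same `rate`.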




Grenoble Images Parole Signal Automatique laboratory (GIPSA-lab)

UMR 5216 CNRS - Grenoble INP - Université Joseph Fourier - Université Stendhal