Team managers: Gérard BAILLY, Thomas HUEBER

Research Areas

Speech processing

This research area brings together studies on traditional aspects of speech processing (analysis, coding, recognition, synthesis), with a special focus on:

  • Speech production 
  • Incremental speech synthesis

Speech production 

The goal here is to better understand the underlying mechanisms of speech production, notably the complex relationships between articulatory movements, the geometry of the vocal tract, and speech acoustics. Our studies rely on large datasets recorded on the BEDEI experimental platform with different experimental techniques, such as 3D electromagnetic articulography, MRI, and ultrasound imaging (Ultraspeech).

These multimodal articulatory data can be combined to build a so-called "articulatory talking head", which can be seen as a virtual clone of a specific speaker. This tool provides a flexible and intuitive visualisation of all articulators, including the tongue and the velum, during speech. It can be used for speech therapy or second language learning (as proposed in the ANR Artis and Vizart3D projects).


Incremental Text-to-speech synthesis 

This research project aims to develop an incremental Text-To-Speech (iTTS) system in order to improve the experience of people with communication disorders who use a TTS system in their daily life. Unlike a conventional TTS system, an iTTS system delivers the synthetic voice while the user is typing (possibly with a delay of one word), and thus before the full sentence is available. By reducing the latency between text input and speech output, iTTS should make communication more interactive (SpeakRightNow project).
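The "delay of one word" idea can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the project's implementation: the function name `incremental_words` and its `delay` parameter are hypothetical, input is modelled as a stream of typed characters, and each released word stands in for a chunk that would be handed to a synthesizer.

```python
from typing import Iterable, Iterator


def incremental_words(keystrokes: Iterable[str], delay: int = 1) -> Iterator[str]:
    """Yield words for synthesis while the text is still being typed.

    Hypothetical helper: a word is released only once `delay` later words
    have been completed, so the synthesizer gets some right-hand context.
    """
    pending = []   # completed words not yet released
    current = ""   # word currently being typed
    for ch in keystrokes:
        if ch.isspace():
            if current:
                pending.append(current)
                current = ""
            # Release every word that now has `delay` completed words after it.
            while len(pending) > delay:
                yield pending.pop(0)
        else:
            current += ch
    # End of input: flush the words that are still waiting.
    if current:
        pending.append(current)
    yield from pending
```

With `delay=1`, the first word becomes available as soon as the second word has been completed, well before the sentence ends. A conventional (non-incremental) system would correspond to waiting for the whole input before releasing anything.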


GIPSA-lab, 11 rue des Mathématiques, Grenoble Campus BP46, F-38402 SAINT MARTIN D'HERES CEDEX - 33 (0)4 76 82 71 31