Supervision / Teaching activities

Student supervision (since 2010)


  • PhD candidates
    • Gaël Le Godais - Machine learning techniques for speech synthesis driven by brain activity, co-supervised by Blaise Yvert (Inserm/BrainTech) and Laurent Girin (Gipsa-lab/INRIA), under way.
    • Mael Pouget - Incremental speech synthesis, co-supervised by Gérard Bailly (Gipsa-lab), defended on 23/06/2017.
    • Florent Bocquelet - Toward a Brain Computer Interface for Speech Rehabilitation, co-supervised by Blaise Yvert (Inserm/BrainTech) and Laurent Girin (Gipsa-lab/INRIA), defended on 23/04/2017.
    • Diandra Fabre - Visual articulatory biofeedback based on augmented ultrasound imaging, co-supervised by Pierre Badin (GIPSA-lab) and Mélanie Canault/Nathalie Baudoin (DDL-Lyon), defended on 16/12/2016.
  • Master's students (second year)
    • Xiaoou Wang - Master 2 phonetics - Univ. Aix-Marseille - Development of a pronunciation training method for Chinese learners based on the visualization of tongue movements via a virtual talking head (April-September 2013).
    • Mael Pouget - PFE Grenoble-INP/Phelma - Real-time implementation of GMM-based mapping for a silent speech interface.
  • Speech pathologists
    • Camille Bach and Lorene Lambourion - Visual biofeedback for speech therapy (2013-2014).
  • Others
    • Gina Yang - Phelma (2014, 3 months) - Automatic extraction of tongue contours in ultrasound images using active models.
    • Remi Vincent - Phelma (2011, 3 months) - Harmonic+noise coding for HMM-based speech synthesis.
    • Nicolas Asin - Master in Statistics - Université de Besançon (2011, 3 months) - HMM-based text-to-speech synthesis using audio-books as acoustic material.


Teaching activities


Hidden Markov Model and Gaussian Mixture Model, application to automatic speech recognition - MASTER SIGMA (2017)



Real-time audio programming - PHELMA

  • Lecture 1
    • Definition(s) of a "real-time system", classification of RT systems (hard/soft, safety-critical, etc.).
    • Theoretical models: synchronous/scheduled, time-triggered/event-based model
    • Hardware aspects (DSP, GPU, etc.)
  • Lecture 2
    • Common implementation issues in real-time audio programming on standard operating systems (preemption, scheduling strategies, context switching, priority inversion, memory allocation, etc.)
    • Specific aspects of real-time signal processing (circular buffering, overlap-add, etc.)
  • Lab work
    • Implementation of a real-time convolution reverb
    • Implementation of a VoIP system based on LPC coding


Speech technologies - ENSIMAG


  • Lecture 1: Automatic Speech recognition (ASR)
    • Introduction
    • Speech analysis/coding for ASR
    • Template-based ASR systems (DTW, level-building/one-stage DTW)
    • Introduction to machine learning
    • HMM-based ASR (discrete Markov models, hidden Markov models, training/evaluation/decoding, context dependency, state tying, introduction to language modeling)

  • Lecture 2: Text-to-speech synthesis
    • Some history ...
    • Introduction to text analysis for TTS (morpho-syntactic analysis, prosody generation, phonetization)
    • Corpus-based (unit selection) TTS
    • HMM-based TTS

  • Lecture 3: Multimodal speech technologies
    • Introduction: speech is "multimodal"
    • Audiovisual speech recognition (visual feature extraction, feature/decision fusion).
    • Audiovisual speech synthesis (image-based system, model-based system, talking head).
    • Multimodal mapping (goals, practical applications, neural-network-based system, GMM-based system).

  • Lecture 4: Voice transformation and conversion (an introduction)
    • Goals and practical applications
    • Speech transformation (pitch shifting & time stretching, TD-PSOLA, Harmonic+noise model)
    • Voice conversion (GMM-based mapping).

Grenoble Images Parole Signal Automatique laboratoire

UMR 5216 CNRS - Grenoble INP - Université Joseph Fourier - Université Stendhal