PATRI Jean-François

'Bayesian Modeling of Speech Motor Planning: Variability, Multisensory Goals and Perceptuo-Motor Interactions'


Thesis supervisor:     Pascal PERRIER

Co-supervisor:     Jean-Luc SCHWARTZ

Doctoral school: Ingénierie pour la santé, la cognition et l'environnement (EDISCE)

Specialty: Engineering of cognition, interaction, learning and creation

Host institution: CNRS

Home institution: Université Paris VI

Funding: ERC


Thesis start date: 1 October 2014

Defense date: 14 June 2018


Jury members:
JACQUES DROULEZ, Emeritus Research Director, ISIR UMR 7222 CNRS-UPMC, Paris, Reviewer.
JOHN HOUDE, Associate Professor, Dept. of Otolaryngology – Head and Neck Surgery, University of California San Francisco, Reviewer.
EMMANUEL MAZER, Research Director, INRIA, Examiner.
DAVID OSTRY, Professor, McGill University, Haskins Laboratories, Examiner.


Abstract: - Context and goal - It is almost a truism that one of the main features of speech is its variability: variability across genders and across speakers, but also from one context to another, and from one repetition to another for a given speaker. Variability underlies at once the beauty of speech, the complexity of its processing by speech technologies, and the difficulty of understanding its mechanisms. In this thesis we study certain aspects of speech variability. Our starting point is the variability that characterizes repetitions of a given utterance by a given speaker in a given condition, which we call intrinsic variability. Models of speech motor control have mainly focused on the contextual aspects of speech variability and have rarely considered its intrinsic component, even though it is this fundamental component that gives speech its naturalness. In motor control more generally, the precise origin of the intrinsic variability of our movements remains controversial and poorly understood. A common assumption, however, is that it mainly originates from neural and muscular noise in the execution chain. The main goal of this thesis is to address the contextual and intrinsic components of speech variability in an integrative computational framework. To this end, we postulate that intrinsic variability in speech is not just execution noise: a fundamental part of it arises at the level of motor planning, from the abundance of possible realizations of an intended speech item.
- Methodology - We formalize this idea in a probabilistic computational framework, Bayesian modeling, in which the abundance of possible realizations of a given speech item is naturally represented as uncertainty, so that variability can be manipulated formally. We illustrate the relevance of this approach with three main contributions.
- Results - First, we reformulate an existing model of speech motor control, the GEPPETO model, in Bayesian terms, and demonstrate that this Bayesian reformulation, which we call B-GEPPETO, contains GEPPETO as a particular case. In particular, we illustrate how the Bayesian approach makes it possible to account for the intrinsic component of speech variability while retaining the principles proposed by GEPPETO for the emergence and structuring of its contextual component. Second, the Bayesian framework makes it possible to extend B-GEPPETO to include a multisensory characterization of speech motor goals, with auditory and somatosensory components. We apply this extension to explore variability in compensation for sensory-motor perturbations in speech production. We account for differences in compensation as sensory preferences, implemented by modulating the relative contribution of each sensory modality in the model. The somatosensory characterization of speech motor goals involves a number of hypotheses, which we set out to evaluate in two experimental studies. Finally, in our third contribution, we use the formalism to reinterpret recent experimental observations of perceptual changes following speech motor adaptation to auditory perturbations. This original analysis is made possible by the unified representation of knowledge in the model, which accounts for production and perception processes in a single computational framework. Taken together, these contributions illustrate how the Bayesian framework offers a structured and systematic approach to building models in the cognitive sciences. The framework facilitates the development of models and their progressive elaboration by specifying and clarifying the underlying assumptions.
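
The two ideas above (planning as a distribution over possible realizations, and sensory preferences as relative weights on each modality) can be illustrated with a deliberately simple sketch. It is not the B-GEPPETO model. Everything in it is an assumption made for illustration: a one-dimensional motor command `m`, identity sensory maps, Gaussian goal regions of equal width, a flat prior, and the hypothetical names `planning_posterior`, `w_aud`, `w_som` and `aud_shift`.

```python
import math
import random


def gaussian(x, mu, sigma):
    """Unnormalized Gaussian; the constant cancels when the posterior is normalized."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def planning_posterior(grid, goal, sigma, w_aud, w_som, aud_shift=0.0):
    """Discrete distribution over motor commands m given a sensory goal.

    Toy assumptions (not taken from the thesis): both sensory maps are the
    identity, auditory feedback may be shifted by aud_shift, the prior over m
    is flat on the grid, and each modality's likelihood is raised to a weight.
    """
    scores = [
        gaussian(m + aud_shift, goal, sigma) ** w_aud
        * gaussian(m, goal, sigma) ** w_som
        for m in grid
    ]
    total = sum(scores)
    return [s / total for s in scores]


def posterior_mean(grid, probs):
    """Expected motor command under the discrete distribution."""
    return sum(m * p for m, p in zip(grid, probs))


grid = [i * 0.01 for i in range(-500, 501)]  # candidate motor commands in [-5, 5]
goal, sigma, shift = 0.0, 0.5, 1.0

# Planning by sampling: repetitions of the same item differ from one another,
# which is the toy counterpart of intrinsic variability arising at planning.
probs = planning_posterior(grid, goal, sigma, w_aud=1.0, w_som=1.0)
rng = random.Random(0)
repetitions = rng.choices(grid, weights=probs, k=5)
print("sampled motor commands:", repetitions)

# Compensation for an auditory shift depends on the relative sensory weights.
for w_aud, w_som in [(1.0, 0.0), (1.0, 1.0), (1.0, 3.0)]:
    p = planning_posterior(grid, goal, sigma, w_aud, w_som, aud_shift=shift)
    compensation = -posterior_mean(grid, p) / shift
    print(f"w_aud={w_aud}, w_som={w_som}: compensation = {compensation:.2f}")
```

Under these toy assumptions the mean compensation works out to the fraction w_aud / (w_aud + w_som) of the shift: complete when only the auditory goal counts, and smaller as the somatosensory weight grows. The thesis models are richer than this, so the sketch only conveys the general mechanism of weighting modalities.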

GIPSA-lab, 11 rue des Mathématiques, Grenoble Campus BP46, F-38402 SAINT MARTIN D'HERES CEDEX - 33 (0)4 76 82 71 31