Bailly G., P. Perrier & E. Vatikiotis-Bateson (2012) Audiovisual Speech Processing. Cambridge, UK, Cambridge University Press, 506 pages. (HAL)
Keller E., G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale (2002) Improvements in Speech Synthesis. Chichester, England, J. Wiley & Sons, Ltd, 393 pages.
Bailly G. & C. Benoît (1992) Talking Machines: Theories, Models and Designs. Amsterdam, North-Holland, 523 pages.
Conferences & workshops organized
Bailly G., O. Perrotin, T. Hueber, D. Lolive & N. Obin (2023) 12th Speech Synthesis Workshop, 26-28 August, Grenoble - France.
Perrotin O., G. Bailly & S. King (2023) Blizzard Challenge, 29 August, Grenoble - France.
Bailly G., G. Skantze & S. Al Moubayed (2017) Speech and HRI, Interspeech, Stockholm, Sweden.
Kim J. & G. Bailly (2016) Auditory-visual expressive speech and gesture in humans and machines, Interspeech, San Francisco, CA.
Fagel S., B. J. Theobald & G. Bailly (2009) LIPS 2009: Visual Speech Synthesis Challenge, AVSP, Norwich - UK.
Fagel S., B. J. Theobald & G. Bailly (2008) LIPS 2008: Visual Speech Synthesis Challenge, Interspeech, Brisbane - Australia.
Bailly G. (2008) Talking heads and pronunciation training, Interspeech, Brisbane - Australia.
Bailly G. & G. Potamianos (2008) Multimodal speech technology, Acoustics, Paris.
Bailly G., N. Campbell & B. Mobius (2003) Hot Topics in Speech Synthesis, EuroSpeech, Geneva, Switzerland.
Guest editor
Kim J., G. Bailly & C. Davis (2018) "Auditory-visual expressive speech and gesture in humans and machines", Speech Communication. (CfP)
Dohen M., J.-L. Schwartz & G. Bailly (2010) "Speech and Face-to-Face Communication", Speech Communication, 52(3): 598–612. (HAL; .pdf)
Fagel S., G. Bailly & B. J. Theobald (2009) "Animating Virtual Speakers or Singers from Audio: Lip-Synching Facial Animation", EURASIP Journal on Audio, Speech, and Music Processing, 2009(ID 826091): 2 pages. (HAL)
Campbell N., W. Hamza, H. Höge, J. Tao & G. Bailly (2006) "Special section on expressive speech synthesis", IEEE Trans. on Audio, Speech, and Language Processing, 14(4): 1097-1098.
Reviewed journal articles
Birulès J., A. Duroyal, A. Vilain, G. Bailly & M. Fort (submitted) "French speakers favor prosody over statistics to segment speech", Language Learning.
Lenglet M., O. Perrotin & G. Bailly (submitted) "A Closer Look at Internal Representations of End-to-End Text-to-Speech Models: How is Phonetic and Acoustic Information Encoded?", Computer, Speech and Language. Available at SSRN
Fournier H., S. Alisamir, S. Azzakhnini, I. Zsoldos, E. Trân, G. Bailly, F. Elisei, B. Bouchot, B. Varini, P. Constant, J. Fruitet, F. Tarpin-Bernard, S. Rossato, F. Portet, O. Koenig, H. Chainay & F. Ringeval (2025) "THERADIA WoZ: An Ecological Corpus for Appraisal-based Affect Research in Healthcare", IEEE Transactions on Affective Computing. DOI (HAL)
Haefflinger L., F. Elisei & G. Bailly (2025) "Data-Driven Control of Eye and Head movements for Triadic Human-Robot Interactions", International Journal of Social Robotics. DOI (HAL)
Perrotin O., B. Stephenson, S. Gerber, G. Bailly & S. King (2025) "Refining the Evaluation of Speech Synthesis: A Summary of the Blizzard Challenge 2023", Computer, Speech & Language, 90, 101747. DOI (HAL)
Zsoldos I., E. Trân, H. Fournier, F. Tarpin-Bernard, J. Fruitet, M. Fouillen, G. Bailly, F. Elisei, B. Bouchot, P. Constant, F. Ringeval, O. Koenig & H. Chainay (2024)
"The value of a virtual assistant to improve engagement in computerized cognitive training at home: An exploratory study", JMIR Rehabilitation and Assistive Technologies, 11, e48129. DOI (HAL)
Bailly G., E. Godde, A.-L. Piat-Marchand & M.-L. Bosse (2022)
"Automatic assessment of aloud readings of young pupils", Speech Communication, 138:67-79. DOI. (HAL)
Godde E., G. Bailly & M.-L. Bosse (2022)
"Pausing and breathing while reading aloud: development from 2nd to 7th grade", Reading and Writing, 35:1-27.
DOI. (HAL)
Godde E., M.-L. Bosse & G. Bailly (2021)
"Echelle Multi-Dimensionnelle de Fluence: nouvel outil d'évaluation de la fluence en lecture prenant en compte la prosodie, étalonné du CE1 à la 5ème", L'année psychologique, 12:19-43, DOI. (HAL)
Godde E., M.-L. Bosse & G. Bailly (2020)
"A review of reading prosody acquisition and development", Reading and Writing, 33(2), 399-426.
DOI. (HAL)
Kim J., G. Bailly & C. Davis (2018) "Introduction to the Special Issue on Auditory-visual expressive speech and gesture in humans and machines", Speech Communication, 63-67. DOI. (HAL)
Gerbier E., G. Bailly & M.-L. Bosse (2018) "Audiovisual Synchronization in Reading while Listening to Texts: Effects on Visual Behavior and Verbal Learning", Computer, Speech and Language, 47:74-92. DOI. (HAL)
Barbulescu A., R. Ronfard & G. Bailly (2017) "Exercises in Speaking Style: A Generative Audiovisual Prosodic Model for Virtual Actors", Computer Graphics Forum, 37-6:40-51. DOI.
(HAL)
Barbulescu A., R. Ronfard & G. Bailly (2017) "Which prosodic features contribute to the recognition of dramatic attitudes?", Speech Communication, 95:78-86. DOI. (HAL)
Nguyen, D.A., G. Bailly & F. Elisei (2017) "Learning Off-line vs. On-line Models of Interactive Multimodal Behaviors with
Recurrent Neural Networks", Pattern Recognition Letters, 100C:29-36. DOI. (HAL)
Bailly G. (2016) "Critical review of the book Gaze in Human-Robot Communication", Journal on Multimodal User Interfaces, 1-2. DOI. (HAL)
Mihoub A., G. Bailly, C. Wolf & F. Elisei (2016) "Graphical models for social behavior modeling in face-to-face interaction", Pattern Recognition Letters, 74:82-89. DOI. (HAL)
Parmiggiani A., M. Randazzo, M. Maggiali, G. Metta, F. Elisei & G. Bailly (2015) "Design and Validation of a Talking Face for the iCub", International Journal of Humanoid Robotics, 1550026:1-20. DOI. (HAL)
Hueber T., L. Girin, X. Alameda & G. Bailly (2015) "Speaker-adaptive acoustic-articulatory inversion using cascaded Gaussian mixture regression", Transactions on Audio, Speech and Language Processing, 23(12): 2246-2259. DOI. (HAL)
Mihoub A., G. Bailly, C. Wolf & F. Elisei (2015) "Learning multimodal behavioral models for face-to-face social interaction", Journal on Multimodal User Interfaces (JMUI), 9(3): 195-210. DOI. (HAL)
Hueber T. & G. Bailly (2015) "Statistical Conversion of Silent Articulation into Audible Speech using Full-Covariance HMM", Computer, Speech and Language, 36: 274–293. DOI. (HAL)
Badin P., C. Savariaux, G. Bailly, F. Elisei & L.-J. Boë (2012) "Caractérisation des mécanismes de production de la parole: une approche biométrique et modélisatrice mono-locuteur et multi-dispositifs", Biométrie Humaine et Anthropologie - Hommage à Bernard Teston, 30(1-2): 67-77. (HAL)
Boucher J.-D., U. Pattacini, A. Lelong, G. Bailly, P. F. Dominey, F. Elisei, S. Fagel & J. Ventre-Dominey (2012) "I reach faster when I see you look: Gaze effects in human-human and human-robot face-to-face cooperation", Frontiers in Neurorobotics, 6(3). DOI. (HAL; .pdf)
Heracleous P., P. Badin, G. Bailly & N. Hagita (2011) "A pilot study on augmented speech communication based on Electro-Magnetic Articulography", Pattern Recognition Letters, 32: 1119-1125. DOI. (HAL)
Bailly G., S. Raidt & F. Elisei (2010) "Gaze, conversational agents and face-to-face communication", Speech Communication - special issue on Speech and Face-to-Face Communication, 52(3): 598–612. (HAL; .pdf)
Badin P., Y. Tarabalka, F. Elisei & G. Bailly (2010) "Can you read tongue movements? Evaluation of the contribution of tongue display to speech understanding", Speech Communication - special issue on Speech and Face-to-Face Communication, 52(3): 493-503. (HAL; .pdf)
Tran V.-A., G. Bailly & H. Loevenbruck (2010) "Improvement to a NAM-captured whisper-to-speech system", Speech Communication - special issue on Silent Speech Interfaces, 52(4): 314-326. (HAL; .pdf)
Bailly G., O. Govokhina, F. Elisei & G. Breton (2009) "Lip-synching using speaker-specific articulation, shape and appearance models", EURASIP Journal on Audio, Speech, and Music Processing - special issue on "Animating Virtual Speakers or Singers from Audio: Lip-Synching Facial Animation", ID 769494: 11 pages. (.pdf)
Heracleous P., D. Beautemps, V.-A. Tran, H. Loevenbruck & G. Bailly (2009) "Exploiting visual information for NAM recognition", IEICE Electronics Express, 6(2):77-82. (HAL;.pdf).
Bailly G., F. Elisei & S. Raidt (2008) "Boucles de perception-action et interaction face-à-face", Revue Française de Linguistique Appliquée, XIII(2): 121-131. (HAL; .pdf)
Badin P., F. Elisei, G. Bailly, C. Savariaux, A. Serrurier & Y. Tarabalka (2007) "Têtes parlantes audiovisuelles virtuelles: Données et modèles articulatoires - applications", Revue de Laryngologie, 128(5): 289-295. (.pdf)
Caplier A., S. Stillittano, O. Aran, L. Akarun, G. Bailly, D. Beautemps, N. Aboutabit & T. Burger (2007) "Image and video for hearing impaired people", EURASIP Journal on Image and Video Processing (electronic journal), 2007, 14 p. (.pdf)
Bailly G., C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin, D. Grison, J.-P. Léoni, J. Liénard, N. Moreau & P. Nguyen (2007) "ARTUS: synthesis and audiovisual watermarking of the movements of a virtual agent interpreting subtitling using cued speech for deaf televiewers", AMSE - Advances in Modelling, 67: 177-187. (.pdf)
Bérar M., M. Desvignes, G. Bailly & Y. Payan (2006) "3D semi landmarks-based statistical face reconstruction", International Journal of Computing and Information Technology, 14(1): 31-43. (.pdf)
Bailly G. & B. Holm (2005) "SFC: a trainable prosodic model", Speech Communication, Special issue on Quantitative Prosody Modelling for Natural Speech Description and Generation (edited by K. Hirose, D. Hirst and Y. Sagisaka), 46(3-4): 348-364. (.pdf)
Gibert G., G. Bailly, D. Beautemps, F. Elisei & R. Brun (2005) "Analysis and synthesis of the 3D movements of the head, face and hands of a speech cuer", Journal of the Acoustical Society of America, 118(2): 1144-1153. (DOI, .pdf)
Odisio M., G. Bailly & F. Elisei (2004) "Tracking talking faces with shape and appearance models", Speech Communication, 44: 63-82. (DOI, .pdf)
Bailly G., M. Bérar, F. Elisei & M. Odisio (2003) "Audiovisual speech synthesis", International Journal of Speech Technology, 6: 331-346. (DOI, .pdf)
Bailly G. (2003) "Close shadowing natural versus synthetic speech", International Journal of Speech Technology, 6(1): 11-19. (DOI, .pdf)
Apostol L., P. Perrier & G. Bailly (2003) "A model of acoustic interspeaker variability based on the concept of formant-cavity affiliation", Journal of the Acoustical Society of America, 115(1): 337-351. (DOI, .pdf)
Bailly G. & B. Holm (2002) "Learning the hidden structure of speech: from communicative functions to prosody", Cadernos de Estudos Linguisticos, 43: 37-54. (DOI)
Badin P., G. Bailly, L. Revéret, M. Baciu, C. Segebarth & C. Savariaux (2002) "Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images", Journal of Phonetics, 30(3): 533-553. (DOI, .pdf)
Morlec Y., G. Bailly & V. Aubergé (2001) "Generating prosodic attitudes in French: data, model and evaluation", Speech Communication, 33(4): 357-371. (DOI, .pdf)
Beautemps D., P. Badin & G. Bailly (2001) "Degrees of freedom in speech production: analysis of cineradio- and labio-films data for a reference subject, and articulatory-acoustic modeling", Journal of the Acoustical Society of America, 109(5): 2165-2180. (DOI, .pdf)
Mawass K., P. Badin & G. Bailly (2000) "Synthesis of French Fricatives by Audio-Video to Articulatory Inversion", Acta Acustica, 86: 136-146. (.pdf)
Yvon F., P. Boula de Mareuil, C. d'Alessandro, V. Aubergé, M. Bagein, G. Bailly, F. Béchet, S. Foukia, J.-F. Goldman, E. Keller, D. O'Shaughnessy, V. Pagel, F. Sannier, J. Véronis & B. Zellner (1998) "Objective evaluation of grapheme to phoneme conversion for text-to-speech synthesis in French", Computer Speech and Language, 12: 393-410. (HAL, .pdf)
Bailly G. (1998) "Cortical dynamics and biomechanics", Bulletin de la Communication Parlée, 4: 35-44.
Bailly G. (1997) "Learning to speak. Sensori-motor control of speech movements", Speech Communication, 22(2-3): 251-267. (DOI, .pdf)
Barbosa P. & G. Bailly (1994) "Characterisation of rhythmic patterns for text-to-speech synthesis", Speech Communication, 15: 127-137. (.ps)
Bonnyman J., K. M. Curtis & G. Bailly (1993) "A transputer-based recurrent neural network for resonance tracking of speech", Transputer Applications and Systems, 1: 1219-1228.
Boë L.-J., P. Perrier & G. Bailly (1992) "The geometric vocal tract variables controlled for vowel production: Proposals for constraining acoustic-to-articulatory inversion", Journal of Phonetics, 20(1): 27-38.
Bailly G. & M. Alissali (1992) "COMPOST: a server for multilingual text-to-speech system", Traitement du Signal, 9(4): 359-366.
Laboissière R., J.-L. Schwartz & G. Bailly (1991) "Modelling the speaker-listener interaction in a quantitative model for speech motor control: a framework and some preliminary results", PERILUS XIV - Department of Linguistics: 57-62.
Bailly G., R. Laboissière & J.-L. Schwartz (1991) "Formant trajectories as audible gestures: An alternative for speech synthesis", Journal of Phonetics, 19(1): 9-23.
Bailly G. & J. Liu (1990) "Détection d'indices par quantification vectorielle et réseaux Markoviens", Journal d'Acoustique, 3: 143-151.
Bailly G. (1989) "Integration of rhythmic and syntactic constraints in a model of generation of French prosody", Speech Communication, 8: 137-146.
Bailly G., A. Perrin & Y. Lepage (1988) "Common approaches in speech synthesis and automatic translation of text", Bulletin du Laboratoire de la Communication Parlée - Grenoble, 2B: 295-311.
Dakkak O. A., G. Murillo, G. Bailly & B. Guérin (1988) "A database of formant parameters for knowledge extraction and synthesis-by-rule", Bulletin du Laboratoire de la Communication Parlée, 391-405.
Book chapters
Bailly, G., A. Mihoub, C. Wolf & F. Elisei (2018). Gaze and face-to-face interaction: from multimodal data to behavioral models. in Advances in Interaction Studies. Eye-tracking in interaction. Studies on the role of eye gaze in dialogue. G. Brône & B. Oben. Amsterdam, NL, John Benjamins: 139-168. DOI. (HAL)
Bailly, G., P. Badin, L. Revéret & A. Ben Youssef (2012). Sensorimotor characteristics of speech production. Audiovisual Speech Processing. G. Bailly, P. Perrier and E. Vatikiotis-Bateson. Cambridge, UK, Cambridge University Press: 368-396. (HAL)
Bailly, G., F. Elisei & S. Raidt (2011). Des machines parlantes aux agents conversationnels incarnés. in Informatique et Sciences Cognitives : influences ou confluences, D. Kayser and C. Garbay, Editors. Ophrys: Paris: 215-234. (HAL)
Lelong, A. & G. Bailly (2011). Study of the phenomenon of phonetic convergence thanks to speech dominoes. Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issue. A. Esposito, A. Vinciarelli, K. Vicsi, C. Pelachaud and A. Nijholt. Berlin, Springer Verlag: 280-293. (HAL)
Fagel, S. & G. Bailly (2010). Speech, gaze and head motion in a face-to-face collaborative task. Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces: Theoretical and Practical Issues. A. Esposito, A. M. Esposito, R. Martone, V. C. Müller and G. Scarpetta. Berlin, Springer Verlag, Lecture Notes in Computer Science (LNCS). 6456: 265-274. (HAL)
Bailly, G., P. Badin, D. Beautemps & F. Elisei (2010). Speech technologies for augmented communication. in Computer-Synthesized Speech Technologies: Tools for Aiding Impairment. J. Mullenix and S. Stern. Hershey, PA, IGI Global: 116-128. (HAL)
Attina, V., G. Gibert, M.-A. Cathiard, G. Bailly & D. Beautemps (2010). The Analysis of French Cued Speech Production-Perception: Towards a complete Text-to-Cued Speech Synthesizer. in Cued Speech and Cued Language Development for Deaf and Hard of Hearing Children, C. LaSasso, K.L. Crain, and J. Leybaert, Editors. Plural Publishing, Inc.: San Diego, CA. p. 449-466. (HAL)
Bailly, G. & C. Pelachaud (2009). Parole et expression des émotions sur le visage d'humanoïdes virtuels. in Traité de la réalité virtuelle: Volume 5 : les humains virtuels. P. Fuchs, G. Moreau and S. Donikian. Paris, Presses de l'Ecole des Mines de Paris. 5: 187-208. (.pdf)
Badin, P., F. Elisei, G. Bailly & Y. Tarabalka (2008). An audiovisual talking head for augmented speech generation: Models and animations based on a real speaker's articulatory data. Conference on Articulated Motion and Deformable Objects, Mallorca, Spain, Springer LNCS, 132-143. (.pdf)
d'Alessandro, C., P. Boula de Mareüil, M.-N. Garcia, G. Bailly, M. Morel, A. Raake, F. Béchet, J. Véronis and R. Prudon (2008). La campagne EvaSy d'évaluation de la synthèse de la parole à partir du texte. L'évaluation des technologies de traitement de la langue. S. Chaudiron and K. Choukri. Paris, Hermès: 183-208.
Bailly, G., F. Elisei & S. Raidt (2006). Virtual talking heads and ambient face-to-face communication. The fundamentals of verbal and non-verbal communication and the biometrical issue. A. Esposito, E. Keller, M. Marinaro and M. Bratanic. Amsterdam, IOS Press BV. (.pdf)
Bérar, M., G. Bailly, M. Chabanas, M. Desvignes, F. Elisei, M. Odisio & Y. Payan (2006). Towards a generic talking head. in Towards a better understanding of speech production processes, J. Harrington and M. Tabain, Editors. Psychology Press: New York, 341-362. (.pdf)
Bailly, G. (2001). Towards more versatile signal generation systems. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 18-21. (.pdf)
Bailly, G. (2001). The COST 258 signal generation test array. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 39-51. (.pdf)
Bailly, G. (2001). A parametric harmonic + noise model. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken and M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 22-38. (.pdf)
Barbosa, P. & G. Bailly (1997). Generation of pauses within the z-score model. Progress in Speech Synthesis. J. P. H. V. Santen, R. W. Sproat, J. P. Olive and J. Hirschberg. New York, Springer Verlag: 365-381.
Bailly, G., V. Aubergé & Y. Morlec (1997). Des représentations cognitives aux représentations phonétiques de l'intonation. Polyphonie pour Iván Fónagy. J. Perrot, L'Harmattan: 19-28.
Bailly, G. & V. Aubergé (1997). Phonetic and phonological representations for intonation. Progress in Speech Synthesis. J. P. H. V. Santen, R. W. Sproat, J. P. Olive and J. Hirschberg. New York, Springer Verlag: 435-441. (.ps)
Bailly, G. (1997). No future for comprehensive models of intonation? Computing prosody: Computational models for processing spontaneous speech. Y. Sagisaka, N. Campbell and N. Higuchi, Springer Verlag: 157-164. (.pdf)
Bailly, G. (1996). Pistes de recherches en synthèse de la parole. Fondements et perspectives en traitement automatique de la parole. H. Méloni. Paris - France, AUPELF-UREF: 109-122.
Bailly, G. (1995). Characterisation of formant trajectories by tracking vocal tract resonances. Levels in speech communication: relations and interactions. C. Sorin, J. Mariani, H. Méloni and J. Schoentgen. Amsterdam, Elsevier: 91-102. (.ps)
Bailly, G., T. Barbe & H. Wang (1992). Automatic labelling of large prosodic databases: tools, methodology and links with a text-to-speech system. Talking Machines: Theories, Models and Designs. G. Bailly and C. Benoît. Amsterdam, North-Holland: 323-333.
Bailly, G., C. Abry, L.-J. Boë, R. Laboissière, P. Perrier & J.-L. Schwartz (1992). Inversion and speech recognition. Signal Processing VI: Theories and Applications. J. Vandewalle, R. Boîte, M. Moonen and A. Oosterlinck. Amsterdam, Elsevier. 1: 159-164.
Bailly, G. (1989). Synthèse de la parole. La parole et son traitement automatique. J.-P. Tubach. Paris, Masson: 408-448.
Invited talks
Bailly, G. (2025) Téléopération et apprentissage de comportements sociaux pour un robot humanoïde conversationnel: des données aux modèles, réseau de recherche 2RSHS, SciencesConf, 12 Juin.
Bailly, G. (2024) Controlling and Probing Generative End-to-end Models: New Opportunities for Research on Prosody, SPROSIG lecture series, Youtube, 27 November.
Bailly, G. (2024) End-to-end models as a proxy to probe massive data, CSTR, Edinburgh, UK, 2 April.
Bailly, G. (2023) End-to-end models as a proxy to probe massive data, Oriental COCOSDA, Delhi, India, 12 December.
Bailly, G. (2022) Exploring latent spaces of end-to-end text-to-speech synthesis systems, International Conference on Speech and Computer (SPECOM), Dharwad, India, 14 November.
Bailly, G. (2021) Training social robots: the devil is in the details, Furhat webinar, 14 October.
Bailly, G. (2021) Characterizing and assessing the oral reading fluency of young readers, Iberspeech, Valladolid, Spain, 24 March
Bailly, G. (2020) Characterizing and assessing the reading fluency of young readers, Séminaire du groupe GETALP, LIG, Grenoble, France.
Bailly, G. (2019) Apprendre des comportements sociaux à un robot humanoïde par téléopération immersive, Inauguration Creativ’Lab CPS/Robotique du LORIA, Nancy, France.
Bailly, G. (2019) Téléopération immersive de robots humanoïdes, Journées Nationales de la Recherche en Robotique (JNRR), Vittel, France.
Elisei F. & Bailly, G. (2018) Téléopération immersive d'un robot humanoïde pour la collecte et la modélisation de corpus d'interactions homme-robot, Journées Neuroscience et multi-modalité dans les interactions humain-humain, humain-agent ou humain-robot, Paris, France.
Bailly, G. (2018) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, IIT Guwahati, Guwahati, India.
Bailly, G. (2018) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, Workshop «Emotionally Intelligent Social Robots» organised by labcom Behaviors.AI (joint lab between LIRIS & Hoomano) and sponsored by ARC6, Lyon, France.
Bailly, G. (2017) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, Joint meeting of GT5/GDR Robotique & GT ACAI/GDR ISIS, Paris, France.
Bailly, G., F. Elisei & D.A. Nguyen (2017) Social robots for diagnosis and rehabilitation, Humanitas Clinical and Research Center, Milan, Italy.
Bailly, G., F. Elisei & D.A. Nguyen (2016) Providing a humanoid robot with socio-communicative skills: the SOMBRERO approach. Xerox Research Centre Europe (XRCE) Seminar, Meylan, France.
Bailly, G. & M. Garnier (2016) Phonetic convergence: data, mechanisms & motivations. Experimental Linguistics, St Petersburg, Russia.
Bailly, G. (2016) Prosodic modelling. Experimental Linguistics, St Petersburg, Russia.
Bailly G. (2015) Round table on the acceptability of robotic technologies. Entretiens Jacques Cartier ROBOTIQUE, SERVICES ET SANTÉ, Lyon, France.
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015) Demonstrating & learning interactive multimodal behaviors for humanoid robots. French-Japanese-German Workshop on Human-Centric Robotics, Munich, Germany.
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015) Learning joint multimodal behaviors for face-to-face interaction: performance & properties of statistical models. Human-Robot Interaction (HRI), Workshop on behavior coordination between animals, humans and robots, Portland, OR.
Bailly G. (2014) Convergence phonétique, Journée scientifique annuelle "Langage & Cognition" du Pôle Cognition (GPC), Grenoble - France.
Bailly G. (2014) Apprentissage de modèles de comportements multimodaux, Journée scientifique "Robotique interactionnelle" de l'ARC5, Grenoble - France.
Bailly G. (2014) Apprentissage de modèles de comportements multimodaux, Ecole de printemps "Robotique et Interactions Sociales", Moliets et Maa - France.
Bailly G. (2013) L'interaction verbale et non-verbale, Journées Nationales de la Recherche en Robotique (JNRR), Annecy - France.
Bailly G. (2013) Parole et expressions faciales, Journée LIMA/GdR ISIS sur l'émotion, Lyon - France.
Bailly G. (2013) Interaction homme-agent conversationnel et génération de comportements verbaux et co-verbaux. Journée de l'Intelligence embarquée, Université de Cergy-Pontoise, Cergy - France.
Bailly G. (2011) Mutual attention and accommodation during face-to-face interaction. UWS Summer Research Festival, Researching Communication (ResCom) at UWS: Brain, Behaviour and Computation, Sydney - Australia.
Bailly G. & A. Lelong (2010). Phonetic convergence: overview and preliminary results. 3rd COST 2102 International Training School: "Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces: Theoretical and Practical Issues", Caserta - Italy.
Bailly G. (2009). Situated face-to-face communication with avatars. 2nd COST 2102 International Training School: "Development of Multimodal Interfaces: Active Listening and Synchrony", Dublin - Ireland.
Bailly G., F. Elisei, S. Raidt, A. Casari & A. Picot (2006). Embodied conversational agents: computing and rendering realistic gaze patterns. Pacific Rim Conference on Multimedia Processing, Hangzhou - China.
Bailly G., N. Campbell & B. Mobius (2003). ISCA Special Session: Hot Topics in Speech Synthesis. EuroSpeech, Geneva - Switzerland.
Bailly G. (2002). Audiovisual speech synthesis. From ground truth to models. International Conference on Speech and Language Processing, Boulder - USA.
Bailly G. (2001). Audiovisual speech synthesis. ETRW on Speech Synthesis, Perthshire - Scotland.
Bailly G. (1996) Sensori-motor control of speech movements. ETRW on Speech Production Modelling: from Control Strategies to acoustics, Autrans - France.
Bailly G. (1990). Robotics in speech production:motor control theory. ETRW Workshop on Speech Synthesis, Autrans - France.
International conferences with review process
2025
Bailly, G., E. André, E. Cooper, B. Cowan, J. Edlund, N. Harte, S. King, E. Klabbers, S. Le Maguer, Z. Malisz, R. K. Moore, B. Möbius, S. Möller, A. Pandey, O. Perrotin, F. Seebauer, S. Strömbergsson, D. R. Traum, C. Tånnander, P. Wagner, J. Yamagishi & Y. Yasuda (submitted) Hot topics in speech synthesis evaluation, Speech Synthesis Workshop (SSW), Leeuwarden, NL.
Sankar S., M. Lenglet, G. Bailly, D. Beautemps & T. Hueber (2025) Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model. ICASSP, Hyderabad, India. (HAL)
2024
Lenglet M., O. Perrotin & G. Bailly (2024) FastLips: an End-to-End Audiovisual Text-to-Speech System with Lip Features Prediction for Virtual Avatars. Interspeech, Kos, Greece, pp. 3450-3454. DOI. (HAL)
Charuau D., A. Briglia, E. Godde & G. Bailly (2024) Training speech-breathing coordination in computer-assisted reading. Interspeech, Kos, Greece. pp. 5128-5132. (DOI, HAL)
Elisei F., L. Haefflinger & G. Bailly (2024) RoboTrio: an annotated multimodal corpus of the interactions of two Humans with a teleoperated robot, Communication in Human-AI Interaction (CHAI), Malmö, Sweden.
Bailly G., R. Legrand, M. Lenglet, F. Elisei, M. Garnier & O. Perrotin (2024) Emotags: Computer-assisted verbal labelling of expressive audiovisual utterances for expressive multimodal TTS, Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy. (HAL)
Charuau D., A. Briglia, E. Godde & G. Bailly (2024) Entraînement de la coordination respiration-parole en apprentissage de la lecture assistée par ordinateur. Journées d'Etudes sur la Parole, pp.351-360, Toulouse, France. (DOI, HAL)
Godde E., M.-L. Bosse & G. Bailly (2024) A reading karaoke to improve reading rate, reading prosody and comprehension, 31st Annual Conference Society for the Scientific Study of Reading (SSSR), Copenhagen, Denmark. (HAL)
Briglia A., E. Godde, D. Charuau, M.-L. Bosse & G. Bailly (2024) A karaoke-based game to improve the ability of French primary school pupils in planning pauses and breathing while reading aloud, 31st Annual Conference Society for the Scientific Study of Reading (SSSR), Copenhagen, Denmark. (HAL)
Younès, R., F. Elisei, D. Pellier & G. Bailly (2024) Impact of verbal instructions and deictic gestures of a cobot on the performance of human coworkers, IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 1025-1032, Nancy, France. (HAL)
Ringeval, F., B. Schuller, G. Bailly, S. Azzakhnini & H. Fournier (2024) EVAC 2024 — Empathic Virtual Agent Challenge: Appraisal-based Recognition of Affective States, International Conference on Multimodal Interaction (ICMI), pp. 677-683, Costa Rica. DOI.
2023
Lenglet M., O. Perrotin & G. Bailly (2023) The GIPSA-Lab Text-To-Speech System for the Blizzard Challenge 2023. 18th Blizzard Challenge Workshop, pp. 34-39, Grenoble, France, DOI. (HAL)
Perrotin O., B. Stephenson, S. Gerber & G. Bailly (2023) The Blizzard Challenge 2023. 18th Blizzard Challenge Workshop, Grenoble, France. pp. 1-27, DOI. (HAL)
Bailly G., M. Lenglet, O. Perrotin & E. Klabbers (2023) Advocating for text input in multi-speaker text-to-speech systems, Speech Synthesis Workshop (SSW), pp. 1-7, Grenoble, France. DOI. (HAL)
Lenglet M., O. Perrotin & G. Bailly (2023) Local Style Tokens: Fine-Grained Prosodic Representations for TTS Expressive Control, Speech Synthesis Workshop (SSW), pp. 120-126, Grenoble, France. DOI. (HAL)
Haefflinger L., F. Elisei, S. Gerber, B. Bouchot, J.-P. Vigne & G. Bailly (2023) Data-driven Generation of Eyes and Head Movements of a Social Robot in Multiparty Conversation, International Conference on Social Robotics (ICSR), pp. 191-203, Doha, Qatar. DOI. (HAL)
Haefflinger L., F. Elisei, B. Bouchot, J.-P. Vigne & G. Bailly (2023) On the benefit of independent control of head and eye movements of a social robot for multiparty human-robot interaction, HCII, pp. 450-466, Copenhagen, Denmark. DOI. (HAL)
2022
Hajj M.-L., M. Lenglet, O. Perrotin & G. Bailly (2022) Comparing NLP solutions for the disambiguation of French heterophonic homographs for end-to-end TTS systems, Speech and Computer (SpeCom), pp. 265–278, Gurugram, India, In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Lecture Notes in Computer Science, vol 13721. Springer. (DOI. HAL)
Lenglet M., O. Perrotin & G. Bailly (2022) Speaking Rate Control of end-to-end TTS Models by Direct Manipulation of the Encoder's Output Embeddings, Interspeech, pp. 11-15, Seoul, South Korea.
(DOI, HAL)
Younes, R., G. Bailly, F. Elisei & D. Pellier (2022) Automatic Verbal Depiction of a Brick Assembly for a Robot Instructing Humans, SIGDIAL, pp. 159-171, Edinburgh, UK.
(DOI, HAL)
Lenglet M., O. Perrotin & G. Bailly (2022) Modélisation de la Parole avec Tacotron2: Analyse acoustique et phonétique des plongements de caractère, Journées d'Etudes sur la Parole (JEP), pp. 845-854, Noirmoutier, France. (DOI, HAL)
2021
Koelsch L., F. Elisei, L. Ferrand, P. Chausse, G. Bailly & P. Huguet (2021) Impact of social presence of humanoid robots: does competence matter? International Conference on Social Robotics (ICSR), pp. 729-739, Singapore. (HAL) [BEST PAPER AWARD]
Perrotin O., A. Hussein, G. Bailly & T. Hueber (2021) Evaluating the extrapolation capabilities of neural vocoders to extreme pitch values, Interspeech, pp. 11-15, Brno, Czech Republic. DOI. (HAL)
Tarpin-Bernard F., J. Fruitet, J.P. Vigne, P. Constant, H. Chainay, O. Koenig, F. Ringeval, B. Bouchot, G. Bailly, F. Portet, S. Alisamir, Y. Zhou, J. Serre, V. Delerue, H. Fournier, K. Berenger, I. Zsoldos, O. Perrotin, F. Elisei, M. Lenglet, C. Puaux, L. Pacheco, M. Fouillen & D. Ghenassia (2021) THERADIA: digital therapies augmented by Artificial Intelligence, International Conference on Applied Human Factors and Ergonomics (AHFE), pp. 478–485, New York, USA. (DOI, HAL)
Godde E., M.-L. Bosse & G. Bailly (2021) Causal links between comprehension and fluency dimensions including prosody from grade 2 to 4, Society for the Scientific Study of Reading Conference, Lancaster, UK.
Mandin S., A. Zaher, S. Meyer, M. Loiseau, G. Bailly, C. Payre-Ficout, J. Diard, Fluence-Group & S. Valdois (2021) Expérimentation à grande échelle d'applications pour tablettes pour favoriser l'apprentissage de la lecture et de l'anglais, Conférence sur les Environnements Informatiques pour l'Apprentissage Humain (EIAH), Fribourg, Switzerland. (HAL)
Godde E., G. Bailly, A.-L. Piat-Marchand & M.-L. Bosse (2021) Suivi longitudinal de la fluence en lecture par évaluation automatique, Conférence sur les Environnements Informatiques pour l'Apprentissage Humain (EIAH), Fribourg, Switzerland. (HAL)
Lenglet M., O. Perrotin & G. Bailly (2021) Impact of segmentation and annotation in French end-to-end synthesis, Speech Synthesis Workshop (SSW), pp. 13-18, Budapest, Hungary. DOI. (HAL)
2020
Bailly, G., E. Godde, A.-L. Piat-Marchand & M.-L. Bosse (2020) Predicting multidimensional subjective ratings of children's readings from the speech signals for the automatic assessment of fluency, International Conference on Language Resources and Evaluation (LREC), pp. 317-322, Marseille, France. (HAL)
Bailly, G. & F. Elisei (2020) Speech in action: designing challenges that require incremental processing of self and others' speech and performative gestures, Workshop on Natural Language Generation for Human-Robot Interaction at Human-Robot Interaction (NLG4HRI 2020), Dublin (virtual), Ireland.
(HAL)
2019
Mohammed, O., G. Bailly & D. Pellier (2019) Style transfer and extraction for the handwritten letters using deep learning, International Conference on Agents and Artificial Intelligence (ICAART), Prague, Czech Republic, pp. 677-684.
(HAL)
Godde E., G. Bailly & M.-L. Bosse (2019) Reading prosody development: automatic assessment for a longitudinal study, Speech & Language Technology for Education (SLaTE), pp. 104-108, Graz, Austria. (DOI, HAL)
2018
Bailly, G. & F. Elisei (2018) Demonstrating and learning multimodal socio-communicative behaviors for HRI: building interactive models from immersive teleoperation data, AI-MHRI: AI for Multimodal Human Robot Interaction Workshop at the Federated AI Meeting (FAIM), pp. 39-43, Stockholm - Sweden. (DOI, HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2018) Comparing cascaded LSTM architectures for generating gaze-aware head motion from speech in HAI task-oriented dialogs, HCI International, pp. 164-175, Las Vegas, USA. (DOI, HAL)
Cambuzat, R., Elisei, F., Bailly, G., Simonin, O. & Spalanzani, A. (2018) Immersive teleoperation of the eye gaze of social robots, International Symposium on Robotics (ISR), Munich, Germany: pp. 232-239. (HAL)
Gerazov, B. & G. Bailly (2018) PySFC - A system for prosody analysis based on the Superposition of Functional Contours prosody model, Speech Prosody, Poznań, Poland: pp. 774-778. (DOI, HAL)
Gerazov, B., G. Bailly & Y. Xu (2018) The significance of scope in modelling tones in Chinese, International Symposium on Tonal Aspects of Languages (TAL), Berlin, Germany: pp. 183-187. DOI. (HAL)
Gerazov, B., G. Bailly & Y. Xu (2018) A Weighted Superposition of Functional Contours model for modelling contextual prominence of elementary prosodic contours, Interspeech, Hyderabad, India: pp. 2524-2528. DOI. (HAL)
Gerazov, B., G. Bailly, O. Mohammed & Y. Xu (2018) Embedding Context-Dependent Variations of Prosodic Contours using Variational Encoding for Decomposing the Structure of Speech Prosody, Workshop on Prosody and Meaning: Information Structure and Beyond, Aix-en-Provence, France. (ARXIV)
Mohammed, O., G. Bailly and D. Pellier (2018) Handwriting styles: benchmarks and evaluation metrics, First International Workshop on Deep and Transfer Learning at the International Conference on Social Networks Analysis, Management and Security (SNAMS), Valencia, Spain: pp. 159-166. DOI. (HAL)
Nguyen, V. Q., L. Girin, G. Bailly, F. Elisei & D.-C. Nguyen (2018) Autonomous sensorimotor learning for sound source localization by a humanoid robot, Workshop on Crossmodal Learning for Intelligent Robotics in conjunction with IEEE/RSJ IROS 2018, Madrid, Spain. (HAL)
Gerazov, B., G. Bailly, O. Mohammed, Y. Xu & P. Garner (2018) A Variational Prosody Model for the decomposition and synthesis of speech prosody, Conference on Neural Information Processing Systems (NIPS), Montréal, Canada: submitted. (ARXIV)
2017
Mohammed O., G. Bailly & D. Pellier (2017) Acquiring human-robot interaction skills with transfer learning techniques, Human-Robot Interaction (HRI) Pioneer Workshop, Vienna, Austria: pp. 359-360. (HAL)
Godde E., G. Bailly, D. Escudero, M.-L. Bosse, M. Bianco & C. Vilain (2017) Improving fluency of young readers: introducing a Karaoke to learn how to breath during a Reading-while-Listening task, Speech & Language Technology for Education (SLaTE), Stockholm, Sweden: pp. 127-131. DOI. (HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2017) An evaluation framework to assess and correct the multimodal behavior of a humanoid robot in human-robot interaction, Gesture in Interaction (GESPIN), Poznań, Poland: pp. 56-62. (HAL)
Cambuzat R., G. Bailly & F. Elisei (2017) Gaze contingent control of vergence, yaw and pitch of robotic eyes for immersive telepresence, European Conf. on Eye Movements (ECEM), Wuppertal, Germany. (HAL)
Godde E., G. Bailly, D. Escudero, M.-L. Bosse & E. Gillet-Perret (2017) Evaluation of reading performance of primary school children: objective measurements vs. subjective ratings, Workshop on Child Computer Interaction (WOCCI), Glasgow, Scotland: pp. 23-27. DOI. (HAL)
2016
Pouget M., O. Nahorna, T. Hueber & G. Bailly (2016) Adaptive latency for part-of-speech tagging in incremental text-to-speech synthesis. Interspeech, San Francisco, CA: pp. 2846-2850. DOI. (HAL)
Bailly G., F. Elisei, A. Juphard & O. Moreau (2016) Quantitative analysis of backchannels uttered by an interviewer during neuropsychological tests. Interspeech, San Francisco, CA: pp. 2905-2909. DOI. (HAL)
Barbulescu A., R. Ronfard & G. Bailly (2016) Characterization of audiovisual dramatic attitudes. Interspeech, San Francisco, CA: pp. 585-589. DOI. (HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2016) Conducting neuropsychological tests with a humanoid robot: design and evaluation. IEEE Int. Conf. on Cognitive Infocommunications (CogInfoCom), Wroclaw, Poland, pp. 337-342. (HAL) [BEST PAPER AWARD]
2015
Foerster F., G. Bailly & F. Elisei (2015) Impact of iris size and eyelids coupling on the estimation of the gaze direction of a robotic talking head by human viewers. Humanoids, Seoul, Korea: pp. 148-153. (HAL)
Pouget, M., T. Hueber, G. Bailly and T. Bauman (2015). HMM training strategy for incremental speech synthesis. Interspeech, Dresden, Germany: pp. 1201-1205. (HAL)
Gerbier, E., G. Bailly & M.-L. Bosse (2015). Using Karaoke to enhance reading while listening: impact on word memorization and eye movements. Workshop on Speech and Language Technology for Education (SLaTE), Leipzig, Germany: pp. 59-64. (HAL)
Bailly G., F. Elisei & M. Sauze (2015). Beaming the gaze of a humanoid robot. Human-Robot Interaction (HRI) Late Breaking Reports, Portland, OR: pp. 47-49. (HAL)
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015). Learning joint multimodal behaviors for face-to-face interaction: performance & properties of statistical models. Human-Robot Interaction (HRI), Workshop on behavior coordination between animals, humans and robots, Portland, OR. (HAL)
Barbulescu, A., G. Bailly, R. Ronfard & M. Pouget (2015) Audiovisual generation of social attitudes from neutral stimuli, Auditory-Visual Speech Processing (AVSP), Vienna, Austria: pp. 34-39. (HAL)
Guillermo G., C. Plasson, F. Elisei, F. Noël & G. Bailly (2015) Qualitative assessment of a beaming environment for collaborative professional activities, European conference for Virtual Reality and Augmented Reality (EuroVR), Milano, Italy. (HAL)
2014
Mihoub, A., G. Bailly & C. Wolf (2014). Modelling perception-action loops: comparing sequential models with frame-based classifiers. Human-Agent Interaction (HAI), Tsukuba, Japan: pp. 309-314. (HAL)
Barbulescu, A., R. Ronfard, G. Bailly, G. Gagneré & H. Cakmak (2014). Beyond basic emotions: expressive virtual actors with social attitudes. ACM/SIGGRAPH Conference on Motion in Games (MIG), Los Angeles, CA: pp. 39-47. (HAL)
Parmiggiani, A., M. Randazzo, M. Maggiali, F. Elisei, G. Bailly & G. Metta (2014). An articulated talking face for the iCub. Humanoids, Madrid: pp. 1-6. (HAL)
Bailly G. & A. Martin (2014). Assessing objective characterizations of phonetic convergence. Interspeech, Singapore: pp. 2011-2015. (HAL)
Rochet-Capellan, A., G. Bailly & S. Fuchs (2014). Is breathing sensitive to the communication partner? Speech Prosody, Dublin: pp. 613-618. (HAL)
2013
Bailly G., A. Rochet-Capellan & C. Vilain (2013). Adaptation of respiratory patterns in collaborative reading. Interspeech, Lyon - France: pp. 1653-1657. (HAL)
Hueber, T., G. Bailly, P. Badin & F. Elisei (2013). Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions. Interspeech, Lyon - France: pp. 2753-2757. (HAL)
Barbulescu, A., T. Hueber, G. Bailly & R. Ronfard (2013). Audiovisual speaker conversion using prosodic features. Auditory-Visual Speech Processing (AVSP), St Jorioz - France: pp. 11-16. (HAL)
Mihoub, A., G. Bailly & C. Wolf (2013). Social behavior modeling based on Incremental Discrete Hidden Markov Models. International Workshop on Human Behavior Understanding (HBU), Barcelona - Spain: pp. 172-183. (HAL)
2012
Bailly G. & C. Gouvernayre (2012). Pauses and respiratory markers of the structure of book reading. Interspeech, Portland, OR: pp. 2218-2221. (HAL)
Hueber, T., A. Ben Youssef, G. Bailly, P. Badin & F. Elisei (2012). Cross-speaker acoustic-to-articulatory inversion using phone-based trajectory HMM for pronunciation training. Interspeech, Portland, OR: pp. 783-786. (HAL)
Hueber, T., G. Bailly & B. Denby (2012). Continuous articulatory-to-acoustic mapping using phone-based trajectory HMM for a silent speech interface. Interspeech, Portland: pp. 723-726. (HAL)
Lelong, A. & G. Bailly (2012). Original objective and subjective characterization of phonetic convergence. International Symposium on Imitation and Convergence in Speech, Aix-en-Provence, France. (HAL)
Lelong, A. & G. Bailly (2012). Characterizing phonetic convergence with speaker recognition techniques. Listening Talker Workshop, Edinburgh, UK: pp. 28-31. (HAL; .pdf)
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly & F. Elisei (2012). Vizart3D : retour articulatoire visuel pour l'aide à la prononciation. Journées d'Etudes sur la Parole (JEP), Grenoble - France: pp. 17-18. (HAL)
2011
Ben Youssef, A., T. Hueber, P. Badin & G. Bailly (2011). Toward a multi-speaker visual articulatory feedback system. Interspeech, Florence: pp. 589-592. (HAL; .pdf)
Bailly G. & W. Barbour (2011). Synchronous reading: learning French orthography by audiovisual training. Interspeech, Florence: pp. 1153-1156. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2011). Improvement of HMM-based acoustic-to-articulatory speech inversion. International Seminar on Speech Production (ISSP), Montréal, CA. (HAL)
Hueber, T., P. Badin, C. Savariaux, C. Vilain & G. Bailly (2011). Differences in articulatory strategies between silent, whispered and normal speech? A pilot study using electromagnetic articulography. International Seminar on Speech Production (ISSP), Montréal, CA. (HAL; .pdf)
Ben Youssef, A., T. Hueber, P. Badin, G. Bailly & F. Elisei (2011). Toward a speaker-independent visual articulatory feedback system. International Seminar on Speech Production (ISSP), Montréal. (HAL)
Hueber, T., P. Badin, G. Bailly, A. Ben Youssef, F. Elisei, B. Denby & G. Chollet (2011). Statistical mapping between articulatory and acoustic features. Application to silent speech interface and visual articulatory feedback. International Workshop on Performative Speech and Singing Synthesis, Vancouver. (HAL; .pdf)
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly & F. Elisei (2011). Articulatory-to-acoustic mapping: application to silent speech interface and visual articulatory feedback. Pan European Voice Conference, Marseille, France. (HAL; .pdf)
2010
Fagel, S. & G. Bailly (2010). On the importance of eye gaze in a face-to-face collaborative task. ACM Workshop on Affective Interaction in Natural Environments (AFFINE), Firenze, Italy: p. 81-85. (HAL; .pdf)
Boucher, J.-D., J. Ventre-Dominey, P. F. Dominey, G. Bailly & S. Fagel (2010). Facilitative effects of communicative gaze and speech in human-robot cooperation. ACM Workshop on Affective Interaction in Natural Environments (AFFINE), Firenze, Italy: p. 71-74. (HAL; .pdf)
Badin, P., A. Ben Youssef, G. Bailly, F. Elisei & T. Hueber (2010). Visual articulatory feedback for phonetic correction in second language learning. Second Language Studies: Acquisition, Learning, Education and Technology, Tokyo. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Acoustic-to-articulatory inversion in speech based on statistical models. Auditory-Visual Speech Processing (AVSP), Hakone, Japan: p. 160-165. (HAL; .pdf)
Bailly G. & A. Lelong (2010). Speech dominoes and phonetic convergence. Interspeech, Tokyo: p. 1153-1156. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Face-to-tongue articulatory inversion based on statistical models. Interspeech, Tokyo: p. 2002-2005. (HAL; .pdf)
Heracleous, P., P. Badin, G. Bailly & N. Hagita (2010). Robust speech recognition based on multimodal fusion. International Conference on Multimedia & Expo (ICME), Singapore: p. 568-572. (.pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Méthodes basées sur les HMMs et les GMMs pour l'inversion acoustico-articulatoire en parole. Journées d'Etudes sur la Parole, Mons, Belgium: p. 249-252. (.pdf)
2009
Ben Youssef, A., P. Badin, G. Bailly & P. Heracleous (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. Interspeech, Brighton: p. 2255-2258. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & T. Toda (2009). Multimodal HMM-based NAM-to-speech conversion. Interspeech, Brighton: p. 656-659. (.pdf)
2008
Bailly G., A. Bégault, F. Elisei & P. Badin (2008). Speaking with smile or disgust: data and models. Auditory-Visual Speech Processing (AVSP), Tangalooma, Australia: p. 111-116. (.pdf)
Bailly G., Y. Fang, F. Elisei & D. Beautemps (2008). Retargeting cued speech hand gestures for different talking heads and speakers. Auditory-Visual Speech Processing (AVSP), Tangalooma, Australia: p. 153-158. (.pdf)
Fagel, S. & G. Bailly (2008). From 3-D speaker cloning to text-to-audiovisual speech. Auditory-Visual Speech Processing (AVSP), Tangalooma - Australia: p. 43-46. (.pdf)
Bailly G., O. Govokhina, G. Breton, F. Elisei & C. Savariaux (2008). The trainable trajectory formation model TD-HMM parameterized for the LIPS 2008 challenge. Interspeech, Brisbane, Australia: p. 2318-2321. (.pdf)
Theobald, B.-J., S. Fagel, G. Bailly & F. Elisei (2008). LIPS2008: Visual speech synthesis challenge. Interspeech, Brisbane, Australia: p. 2310-2313. (.pdf)
Fagel, S., F. Elisei & G. Bailly (2008). From 3-D speaker cloning to text-to-audiovisual-speech. Interspeech, Brisbane, Australia: p. 2325. (.pdf)
Badin, P., Y. Tarabalka, F. Elisei & G. Bailly (2008). Can you “read tongue movements”? Interspeech, Brisbane, Australia: p. 2635-2637. (HAL; .pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & C. Jutten (2008). Improvement to a NAM captured whisper-to-speech system. Interspeech, Brisbane, Australia: p. 1465-1468. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & C. Jutten (2008). Amélioration de système de conversion de voix inaudible vers la voix audible. Journées d'Etudes sur la Parole (JEP), Avignon, France. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & T. Toda (2008). Predicting F0 and voicing from NAM-captured whispered speech. Speech Prosody, Campinas, Brazil: p. 107-110. (.pdf)
Bailly G. & A. Bartroli (2008). Generating Spanish intonation with a trainable prosodic model. Speech Prosody, Campinas - Brazil: p. 63-66. (.pdf)
Bérar, M., M. Desvignes & G. Bailly (2008). Reconstruction faciale 3D à partir d’images 3D. RFIA, Amiens, France.
2007
Raidt, S., G. Bailly & F. Elisei (2007). Gaze patterns during face-to-face interaction. IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshop on Communication between Human and Artificial Agents (CHAA), Fremont, CA: p. 338-341. (.pdf)
Elisei, F., G. Bailly & A. Casari (2007). Towards eyegaze-aware analysis and synthesis of audiovisual speech. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands: p. 120-125. (HAL; .pdf)
Fagel, S., G. Bailly & F. Elisei (2007). Intelligibility of natural and 3D-cloned German speech. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands: p. 126-131. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2007). Mutual gaze during face-to-face interaction. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2007). Analyzing and modeling gaze during face-to-face interaction. 7th International Conference on Intelligent Virtual Agents, Paris: p. 403-404. (.pdf)
Raidt S., G. Bailly & F. Elisei (2007). Impact of cognitive state on gaze patterns during face-to-face interaction. 14th European Conference on Eye Movements, Potsdam, Germany.
Picot, A., G. Bailly, F. Elisei & S. Raidt (2007). Scrutinizing natural scenes: controlling the gaze of an embodied conversational agent. 7th International Conference on Intelligent Virtual Agents, Paris: p. 272-282. (.pdf)
Govokhina, O., G. Bailly & G. Breton (2007). Learning optimal audiovisual phasing for a HMM-based control model for facial animation. ISCA Speech Synthesis Workshop, Bonn, Germany. (.pdf)
Tarabalka, Y., P. Badin, F. Elisei & G. Bailly (2007). Peut-on lire sur la langue? Évaluation de l'apport de la vision de la langue à la compréhension de la parole. Journées de Phonétique Clinique, Grenoble, France.
Tarabalka, Y., P. Badin, F. Elisei & G. Bailly (2007). Can you “read tongue movements”? Evaluation of the contribution of tongue display to speech understanding. Conférence Internationale sur l'Accessibilité et les systèmes de suppléance aux personnes en situation de Handicaps (ASSISTH), Toulouse - France. (HAL; .pdf)
Beautemps, D., L. Girin, N. Aboutabit, G. Bailly, L. Besacier, G. Breton, T. Burger, A. Caplier, M.-A. Cathiard, D. Chêne, J. Clarke, F. Elisei, O. Govokhina, M. Marthouret, S. Mancini, Y. Mathieu, P. Perret, B. Rivet, P. Sacher, C. Savariaux, S. Schmerber, J.-F. Sérignat, M. Tribout & S. Vidal (2007). TELMA : Téléphonie à l'usage des malentendants. Des modèles aux tests d'usage. Conférence Internationale sur l'Accessibilité et les systèmes de suppléance aux personnes en situation de Handicaps (ASSISTH), Toulouse - France. (HAL; .pdf)
2006
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). A new trainable trajectory formation system for facial animation. ISCA Workshop on Experimental Linguistics, Athens, Greece: 25-32. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluation of a virtual speech cuer. ISCA Workshop on Experimental Linguistics, Athens, Greece: 141-144. (.pdf)
Bailly G., F. Elisei, S. Raidt, A. Casari & A. Picot (2006). Embodied conversational agents: computing and rendering realistic gaze patterns. LNCS 4261: Pacific Rim Conference on Multimedia Processing, Hangzhou, China: 9-18. (HAL; .pdf)
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). TDA: A new trainable trajectory formation system for facial animation. InterSpeech, Pittsburgh, PA: 2474-2477. (.pdf)
Bailly G. & I. Gorisch (2006). Generating German intonation with a trainable prosodic model. InterSpeech, Pittsburgh, PA: 2366-2369. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2006). Audiovisual speech enhancement experiments for mouth segmentation evaluation. EUSIPCO, Pisa, Italy. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluating a virtual speech cuer. InterSpeech, Pittsburgh, PA: 2430-2433. (.pdf)
Bailly G., C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin, D. Grison, J.-P. Léoni, J. Liénard, N. Moreau & P. Nguyen (2006). ARTUS : calcul et tatouage audiovisuel des mouvements d'un personnage animé virtuel pour l'accessibilité d'émissions télévisuelles aux téléspectateurs sourds comprenant la Langue Française Parlée Complétée. Handicap, Paris: 265-270. (.pdf)
Bailly G., F. Elisei, P. Badin & C. Savariaux (2006). Degrees of freedom of facial movements in face-to-face conversational speech. International Workshop on Multimodal Corpora, Genoa - Italy: 33-36. (.pdf)
Boula de Mareüil, P., C. d'Alessandro, A. Raake, G. Bailly, M.-N. Garcia & M. Morel (2006). A joint intelligibility evaluation of French text-to-speech systems: the EvaSy SUS/ACR campaign. Language Resources and Evaluation Conference (LREC), Genova - Italy: 2034-2037. (.pdf)
Garcia, M.-N., C. d’Alessandro, P. Boula de Mareüil, A. Raake, G. Bailly & M. Morel (2006). A joint intelligibility evaluation of French text to speech systems: the EVASY/SUS campaign. Language Resources and Evaluation Conference (LREC), Genova - Italy: 307-310. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2006). Does a virtual talking face generate proper multimodal cues to draw user's attention towards interest points? Language Resources and Evaluation Conference (LREC), Genova - Italy: 2544-2549. (.pdf)
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). Evaluation de systèmes de génération de mouvements faciaux. Journées d'Etudes sur la Parole, Rennes - France: 305-308. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluation d'un système de synthèse 3D de Langue française Parlée Complétée. Journées d'Etudes sur la Parole, Rennes - France: 495-498. (.pdf)
2005
Raidt, S., F. Elisei & G. Bailly (2005). Face-to-face interaction with a conversational agent: eye-gaze and deixis. in International Conference on Autonomous Agents and Multiagent Systems, Utrecht University, The Netherlands. (.pdf)
Raidt, S., F. Elisei & G. Bailly (2005). Basic components of a face-to-face interaction with a conversational agent: mutual attention and deixis. in Smart Objects and Ambient Intelligence, Grenoble - France: 247-252. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2005). Eye gaze in face-to-face interaction with a talking head. European Conference on Eye Movements, Bern, Switzerland.
Bailly G., F. Elisei & S. Raidt (2005). Multimodal face-to-face interaction with a talking face: mutual attention and deixis. in Human-Computer Interaction International, Las Vegas. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Statistical skull models from 3D X-ray images. in International Conference on Reconstruction of Soft Facial Parts, Remagen, Germany. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). 3D statistical facial reconstruction. in International Symposium on Image and Signal Processing and Analysis, Zagreb, Croatia. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Missing data estimation using polynomial kernels. in International Conference on Advances in Pattern Recognition, Bath, UK. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Reconstruction par noyaux polynomiaux. in GRETSI, Louvain-la-Neuve, Belgium. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Statistical active model for mouth components segmentation. in International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA: 1021-1024. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Modèle statistique et description locale d'apparence pour la détection des contours des lèvres. GRETSI, Louvain-la-Neuve, Belgium. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Non-linear active model for mouth inner and outer contours detection. in EUSIPCO, Antalya, Turkey. (.pdf)
Elisei, F., G. Bailly, G. Gibert & R. Brun (2005). Capturing data and realistic 3D models for cued speech analysis and audiovisual synthesis. in Auditory-Visual Speech Processing Workshop, Vancouver, Canada. (.pdf)
Boula de Mareüil, P., C. d'Alessandro, G. Bailly, F. Béchet, M.-N. Garcia, M. Morel, R. Prudon & J. Véronis (2005). Evaluation de la prononciation des noms propres par quatre convertisseurs graphème-phonème en français. Colloque Traitement lexicographique des noms propres, Tours - France.
Boula de Mareüil, P., C. d'Alessandro, G. Bailly, F. Béchet, M.-N. Garcia, M. Morel, R. Prudon & J. Véronis (2005). Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters. InterSpeech, Lisbon, Portugal: 1521-1524. (.pdf)
2004
Raidt,
S., G. Bailly,
B. Holm & H. Mixdorff (2004) Automatic generation
of prosody: comparing two superpositional systems. in International
Conference on Speech Prosody. Nara, Japan: 417-420. (.pdf)
Berar,
M., M. Desvignes, G.
Bailly & Y. Payan (2004) 3D Meshes
Registration: Application to statistical skull model. in International
Conference on Image Analysis and Recognition. Porto -
Portugal: 100-107. (.pdf)
Odisio,
M. & G. Bailly
(2004) Audiovisual perceptual evaluation of
resynthesised speech movements. in International Conference
on Speech and Language Processing. Jeju, Korea:
2029-2032. (.pdf)
Bailly G., Holm, B. &
Aubergé, V.
(2004) A trainable prosodic model: learning the contours implementing
communicative functions within a superpositional model of
intonation. International Conference on Speech and Language
Processing.
Jeju, Korea.
p. 1425-1428. (.pdf)
Chen, G.-P., G.
Bailly, Q.-F. Liu & R.-H.
Wang (2004)
A superposed prosodic model for Chinese text-to-speech synthesis. in International
Conference of Chinese Spoken Language Processing.
p.177-180. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004). Audiovisual
text-to-cued speech synthesis. Eusipco, Vienna -
Austria: 1007-1010. (.pdf)
Gibert, G., G.
Bailly & F. Elisei
(2004). Audiovisual text-to-cued speech synthesis. 5th Speech
Synthesis Workshop, Pittsburgh, PA:
85-90. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004).
Evaluation of a speech cuer: from motion capture to a concatenative
text-to-cued speech system. Language Resources and
Evaluation Conference (LREC),
Lisbon, Portugal: 2123-2126. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004).
Mise en oeuvre d'un synthétiseur 3D de Langage
Parlé
Complété. Journées d'Etudes
sur la Parole,
Fès, Maroc: 245-248. (.pdf)
Gacon, P.,
P.-Y. Coulon & G.
Bailly (2004).
Shape and sampled-based appearance model for mouth components
segmentation. International Workshop on Image Analysis for
Multimedia
Interactive Services, Lisbon. (.pdf)
2003
Odisio, M. & G.
Bailly (2003).
Shape and appearance models of talking faces for model-based tracking. Audio
Visual Speech Processing, St Jorioz,
France: 105-110. (.pdf)
Odisio, M. & G.
Bailly (2003).
Shape and appearance models of talking faces for model-based tracking. International
Conference on Computer Vision,
Nice - France: 143-148.
(.pdf)
Bérar,
M., G. Bailly,
M. Chabanas, F. Elisei, M. Odisio & Y. Payan (2003).
Towards a generic talking head. 6th International Seminar on
Speech Production, Sydney - Australia: 7-12. (.pdf)
Bailly G.,
N.
Campbell & B. Mobius (2003).
ISCA Special Session: Hot Topics in Speech Synthesis. EuroSpeech,
Geneva, Switzerland: 37-40. (DOI)
Bailly G.,
F. Elisei, M. Odisio, D. Pelé & K. Grein-Cochard
(2003).
Objects and agents for MPEG-4 compliant scalable face-to-face
telecommunication. Smart Object Conference,
Grenoble - France: 204-207. (.pdf)
Badin, P., G.
Bailly, F. Elisei & M.
Odisio (2003). Virtual talking heads and audiovisual articulatory
synthesis. International Congress on Phonetic Sciences,
Barcelona: 193-197. (.pdf)
2002
Holm,
B. & G. Bailly
(2002). Learning the hidden structure of intonation:
implementing various functions of prosody. Speech Prosody,
Aix-en-Provence, France:
399-402. (.pdf)
Bailly G.,
G. Gibert
& M. Odisio (2002). Evaluation of
movement generation
systems using the point-light technique. IEEE Workshop on
Speech Synthesis, Santa Monica, CA: 27-30. (DOI)
Bailly G. (2002).
Audiovisual speech synthesis. From ground truth
to models. International
Conference on Speech and Language Processing, Denver -
Colorado: 1453-1456. (DOI)
Bailly G. &
P.
Badin (2002). Seeing
tongue movements from outside. International Conference on
Speech
and Language Processing, Denver - Colorado: 1913-1916. (.pdf)
Bailly G. &
B. Holm
(2002). Learning the hidden structure
of speech: from
communicative functions to prosody. Symposium on Prosody and
Speech Processing, Tokyo, Japan: 113-118. (DOI)
2001
Elisei,
F., M. Odisio, G. Bailly
& P. Badin (2001). Creating and controlling
video-realistic talking heads. Auditory-Visual Speech
Processing Workshop, Scheelsminde, Denmark: 90-97. (.ps)
Bailly G. (2001).
Audiovisual speech synthesis. ETRW on
Speech
Synthesis, Perthshire - Scotland: 1-10.
Bailly G. (2001).
Close shadowing natural vs synthetic speech. ETRW
on Speech Synthesis, Perthshire - Scotland: 87-90. (DOI, .pdf)
2000
Revéret,
L., G. Bailly
& P. Badin (2000). MOTHER: a new generation of talking
heads providing a flexible articulatory control for video-realistic
speech animation. International Conference on Speech and
Language Processing, Beijing - China: 755-758. (DOI)
Revéret,
L., G. Bailly,
P. Borel & P. Badin (2000). Analyse par
la synthèse d'un visage 3D parlant : inversion
opto-articulaire. Journées d'Etudes sur la Parole,
Aussois, France: 125-128.
Holm,
B. & G. Bailly
(2000). Generating prosody by superposing
multi-parametric overlapping contours. Proceedings of the
International Conference on Speech and Language Processing,
Beijing, China: 203-206. (DOI)
Holm, B. & G.
Bailly (2000).
Génération
de la prosodie par superposition de contours chevauchants: application
à l'énonciation de formules
mathématiques. Journées d'Etudes sur la
Parole, Aussois - France: 113-116. (.ps)
Borel, P., P. Badin, L.
Revéret & G.
Bailly (2000). Modélisation
articulatoire linéaire 3D d'un visage pour une
tête parlante virtuelle. Journées
d'Etudes sur la Parole,
Aussois, France:
121-124. (.ps)
Bailly G. (2000).
Evaluation des systèmes
d'analyse-modification-synthèse de parole. Journées
d'Etudes sur la Parole, Aussois - France: 109-112. (.ps)
Bailly G.,
E. R. Banga, A. Monaghan & E. Rank (2000). The
Cost258 Signal
Generation Test Array. Second International Conference on
Language Resources and Evaluation,
Athens - Greece: 651-654. (.pdf)
Badin P., P. Borel, G.
Bailly, L. Revéret, M.
Baciu & C.
Segebarth (2000). Towards an audiovisual virtual talking head: 3D
articulatory modeling of tongue, lips and face based on MRI and video
images. Proceedings of the 5th Speech Production Seminar,
Kloster Seeon - Germany: 261-264. (.ps)
1999
Morlec Y., G.
Bailly & V. Aubergé (1999).
Training an application-dependent prosodic model: corpus and evaluation. Proceedings
of the European Conference on Speech Communication and Technology,
Budapest, Hungary: 1643-1646. (DOI)
Holm B., G.
Bailly & C. Laborde (1999)
Performance structures of mathematical
formulae. International Congress of Phonetic
Sciences, San Francisco, USA: 1297-1300. (.pdf)
Badin, P., G. Bailly & L.-J. Boé (1999).
Speech production models and virtual talking heads: useful aids
for pronunciation training, InStill,
Besançon, France. (.ps)
Bailly G. (1999)
Accurate estimation of sinusoidal
parameters in an harmonic+noise
model for speech synthesis, Proceedings
of the European Conference on Speech
Communication and Technology, Budapest, Hungary: 1051-1054. (DOI)
1998
Neagu, A. & G.
Bailly (1998)
Collaboration vs. competition between burst and transition cues for the
perception and identification of
French stops, Proceedings
of the International Conference on Speech and Language Processing,
Sydney, Australia: 2127-2130. (DOI)
Morlec, Y., A. Rilliard, G. Bailly &
V. Aubergé
(1998) Evaluating the adequacy
of synthetic prosody in
signaling syntactic boundaries:
methodology and first results, First
International Conference on Language Resources
and Evaluation, Granada, Spain. (.ps)
Bailly G.,
P. Badin & A. Vilain (1998) Contribution
de
la mâchoire à la géométrie
de la langue dans les modèles articulatoire statistiques. Journées
d'Etudes sur la Parole, Martigny, Suisse: 287-290.
(.ps)
Bailly G.,
P. Badin & A. Vilain (1998) Synergy
between jaw and lips/tongue
movements: consequences in
articulatory modelling, International Conference on Speech
and Language Processing,
Sydney, Australia: 417-420. (.pdf)
Badin, P., L. Pouchoy, G.
Bailly, M. Raybaudi, C. Segebarth, J.-F. Lebas, M. Tiede,
E. Vatikiotis-bateson & Y. Tohkura (1998) Un
modèle
articulatoire tridimensionnel du conduit vocal basé sur des
données IRM. Journées d'Etudes sur la
Parole, Martigny, Suisse: 283-286.
Badin, P., G.
Bailly, M. Raybaudi & C. Segebarth (1998) A
three-dimensional linear
articulatory model based on MRI data, International
Conference on Speech and Language Processing, Sydney,
Australia: 417-420. (.pdf)
Badin, P., G.
Bailly, M. Raybaudi & C. Segebarth (1998) A
three-dimensional linear
articulatory model based on MRI data, ESCA/COCOSDA
Workshop on Speech Synthesis, Jenolan Caves, Australia:
249-254.
Badin P., G.
Bailly & L.-J. Boé
(1998) Towards the use of a virtual talking head and of speech mapping tools
for pronunciation training, ESCA Tutorial
and
Research Workshop on Speech Technology in Language Learning,
Stockholm
– Sweden
1997
Neagu, A. & G.
Bailly
(1997) Relative contributions of noise burst and vocalic transitions
to the perceptual identification
of stop consonants, European Conference on Speech
Communication and Technology,
Rhodes - Greece: 2175-2178. (.pdf)
Morlec, Y., G.
Bailly & V. Aubergé (1997)
Generating the prosody of
attitudes, ETRW Workshop on Prosody,
Athens -
Greece: 251-254. (.pdf)
Morlec, Y., G.
Bailly & V. Aubergé (1997) Synthesising
attitudes with global rhythmic
and intonation contours, European Conference on Speech
Communication and Technology,
Rhodes - Greece: 219-222. (.pdf)
Mawass, K., P. Badin & G. Bailly (1997)
Synthesis
of fricative consonants by
audiovisual-to-articulatory inversion, European Conference on
Speech Communication and Technology,
Rhodes - Greece: 1359-1362. (DOI)
1996
Neagu, A. & G.
Bailly (1996) R1,
R2 et R3 : un
ensemble
robuste de paramètres pour la caractérisation des
espaces vocaliques. Journées d'Etudes sur la Parole,
Avignon-France: 247-250.
Morlec Y., G.
Bailly & V. Aubergé (1996) Un
modèle
connexionniste modulaire pour l'apprentissage des gestes intonatifs. Journées
d'Etudes sur la Parole, Avignon-France: 207-210. (.ps)
Morlec Y., G.
Bailly & V. Aubergé (1996) Generating
intonation by superposing
gestures. International
Conference on Speech and Language Processing, Philadelphia -
USA: 283-286. (.pdf)
Beautemps D., P. Badin, G.
Bailly, A. Galvàn & R.
Laboissière (1996) Evaluation
of an articulatory-acoustic model
based on a reference subject. ETRW
on Speech
Production: from
Control Strategies to acoustics, Autrans - France: 45-48.
Bailly G. (1996)
Emergence
de prototypes sensori-moteurs à partir d'exemplaires
audio-visuels. Journées d'Etudes sur la Parole,
Avignon-France: 87-90.
Bailly G. (1996). Building sensori-motor prototypes from audio-visual exemplars. International Conference on Speech and Language Processing (ICSLP), Philadelphia - USA: 957-960.
Badin, P., K. Mawass, G.
Bailly, C. Vescovi, D. Beautemps & X. Pelorson
(1996) Articulatory synthesis of fricative
consonants: data and models, ETRW
on Speech Production: from Control Strategies to
acoustics, Autrans - France: 221-224.
Bailly G. (1996)
Sensori-motor control of speech
movements. ETRW on
Speech Production Modelling: from Control Strategies to acoustics,
Autrans: 145-154.
1995
Morlec, Y., V. Aubergé
& G. Bailly
(1995)
Evaluation of automatic generation of prosody with a superposition
model, International Congress of Phonetic
Sciences, Stockholm - Sweden: 224-227.
(.ps)
Bailly G.,
L.-J. Boé, N. Vallée & P.
Badin (1995) Articulatory-acoustic
prototypes for speech production, European Conference on
Speech Communication and Technology,
Madrid - Spain: 1913-1916.
Bailly G. (1995)
Recovering place of articulation
for occlusives in VCVs, International
Congress of Phonetic Sciences, Stockholm - Sweden: 230-233.
Badin P., B. Gabioud, D. Beautemps, T.
Lallouache, G. Bailly,
S. Maeda,
J.-P. Zerling & G. Brock
(1995) Cineradiography of VCV sequences: articulatory-acoustic data
for a speech production model, International Congress on Acoustics,
Trondheim - Norway: 349-352. (.pdf)
Aubergé V. & G. Bailly (1995)
Generation of intonation: a global approach, European Conference on Speech
Communication and Technology,
Madrid: 2065-2068. (.pdf)
1994
Bonnyman, J. M., M. Curtis & G. Bailly (1994) A
neural network application for
the analysis and synthesis
of multilingual speech, IEEE
International Conference on Speech,
Image Processing and Neural Networks, Hong Kong: 327-330.
Bailly G.,
E. Castelli & B. Gabioud (1994)
Building prototypes for articulatory
speech synthesis, International Workshop on Speech
Synthesis, New Paltz - New
York: 9-12.
Barbosa, P. & G.
Bailly (1994) Generating pauses
within the z-score model, ETRW
on Speech Synthesis, New Paltz - New York: 101-104.
1993
Bailly G. (1993)
Resonances as possible representation
of speech in the
auditory-to-articulatory transform. European Conference on
Speech Communication and Technology,
Berlin, 1511-1514. (.pdf)
Alissali, M. & G.
Bailly (1993) Compost: a
client-server model for applications
using text-to-speech. European Conference on Speech
Communication and Technology,
Berlin: 2095-2098.
1992
Barbosa, P. & G.
Bailly (1992)
Génération
automatique des P-centers. Journées d'Etudes sur
la Parole, Bruxelles, Belgique: 357-361.
Barbosa, P. & G.
Bailly (1992) Generating
segmental
duration by P-centers, Fourth Rhythm Workshop:
Rhythm
Perception and Production, Bourges, France, Ville de
Bourges: 163-168
1991
Guerti, M. & G.
Bailly (1991) Synthesis-by-rule using COMPOST: modelling resonance
trajectories. European Conference on Speech
Communication and Technology, Genova - Italy:
43-46.
Bailly G. &
M. Guerti (1991). Synthesis-by-rule for
French. International
Congress of
Phonetic Sciences, Aix-en-Provence, France:
506-509.
Alissali,
M. & G. Bailly
(1991). COMPOST: un serveur de synthèse
multilingue. 8e Congrès sur la Reconnaissance de
Formes et l'Intelligence Artificielle, Lyon-Villeurbanne:
183-192.
1990
Yé,
H., S. Wang, G. Bailly
& F. Robert (1990). Exploration of temporal
processing of a sequential network for speech parameter estimation. Applications
of Artificial Neural Networks, Orlando,
Florida: 16-20.
Wang, H., G.
Bailly &
D. Tuffelli (1990). Automatic segmentation and alignment of continuous
speech based on the temporal decomposition model. International
Conference on Speech and Language Processing,
Kobe, Japan: 457-460.
Barbe, T. & G.
Bailly (1990). Evaluation
d'un détecteur de fréquence
fondamentale du signal microphonique par comparaison à une
référence
laryngographique. Journées d'Etudes sur la Parole,
Montréal:
165-169.
Bailly G. &
M. Guerti (1990). Anticipation
et rétention dans les mouvements vocaliques en
Français. Journées d'Etudes sur la
Parole, Montréal - Canada: 292-295.
Bailly G.,
T. Barbe & H. Wang (1990). Automatic labelling of large
prosodic
databases: tools, methodology and links with a text-to-speech system. ETRW
Workshop on Speech Synthesis, Autrans - France: 201-204.
Bailly G. (1990).
Robotics in speech production: Motor
control theory. ETRW
Tutorial Day on
Speech Synthesis, Autrans - France, 17-26.
1989 and before
Bailly G. &
A. Tran
(1989). COMPOST: a rule-compiler for
speech synthesis. European
Conference on Speech Communication and Technology: 136-139.
Bailly G.,
P.-F. Marteau & C. Abry (1989). A new algorithm for temporal
decomposition of speech. Application to a numerical model of
coarticulation. IEEE International Conference on Acoustics,
Speech, and Signal Processing, Glasgow, Scotland: 508-511.
Marteau P.-F., G.
Bailly & M.-T. Janot-Giorgetti (1988).
Stochastic model of
diphone-like segments based on trajectory concepts. IEEE
International Conference on Acoustics, Speech, and Signal Processing,
New York - USA: 615-618.
Bailly
G., G. Murillo, O.
A. Dakkak & B.
Guérin (1988). A
text-to-speech synthesis system for French by formant synthesis. 7th
FASE Symposium: 225-260.
Dakkak, O. A., G. Murillo, G. Bailly & B. Guérin (1987). Using Contextual Information in View of Formant Analysis Improvement. Recent Advances in Speech Understanding and Dialog Systems, Bad Windsheim - Germany, NATO ASI Series.
Dakkak O. A., G. Murillo & G. Bailly (1987).
Automatic
extraction of formant
parameters using a-priori knowledge. IASTED, Applied Control
Filtering and Signal Processing, Geneva –
Switzerland
Bailly G. & J. Liu (1987).
Détection
d'indices par
quantification
vectorielle et réseaux Markoviens. Journées
d'Etudes sur la Parole,
Hammamet - Tunisie, GALF: 60-63.
Bailly G. (1986).
Détection
du fondamental par AMDF et programmation dynamique. Journées
d'Etudes sur la Parole, Aix-en-Provence - France, GALF:
285-288.
Bailly G. (1986).
Un modèle de congruence relationnel pour la
synthèse
de la prosodie du français. Journées
d'Etudes sur la Parole, Aix-en-Provence - France, GALF:
75-78.
International conferences without review process
Bosse M.-L., G.
Bailly and E. Gerbier (2015). Acquisition
de l’orthographe d’un mot nouveau pendant la
lecture d’un texte : une
étude oculométrique en lecture
synchrone. Symposium
International sur la Littératie
à l’Ecole (SILE), Sherbrooke, Canada.
Gerbier, E., G.
Bailly
and M.-L.
Bosse (2015). The
effect of audiovisual synchronization in
reading while listening to texts: An eye-tracking study. Conference of
the European Society for Cognitive Psychology (ESCOP),
Paphos, Cyprus.
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly
& F. Elisei (2011). Articulatory-to-acoustic mapping:
application to silent speech
interface and visual articulatory feedback. Pan-European Voice
Conference (PEVOC).
Marseille.
Badin, P., F. Elisei, L. Huang, Y.
Tarabalka & G.
Bailly (2008).
Vision of tongue in augmented speech: contribution to speech
comprehension and visual tracking strategies. In Speech and
Face-to-Face
communication - A workshop / Summer School dedicated to
the Memory of Christian Benoît: pp. 97. Grenoble, France,
27-29 October 2008.
Bailly G.,
O. Govokhina & G. Breton (2008). Multimodal
control of talking heads. Acoustics. Paris.
Laboissière R., J.-L.
Schwartz & G.
Bailly (1991). Motor control for speech
skills: a connectionist approach. Proceedings
of the 1990 Connectionist Models Summer
School. D. S. Touretzky, J. L. Elman, T. J. Sejnowski and G.
E. Hinton. San Mateo, CA, Morgan
Kaufmann: 319-327
Wang H., G.
Bailly & D. Tuffelli (1990). Automatic
segmentation and alignment of
continuous speech based on
the temporal decomposition model, Journal
of the Acoustical Society of America: S106.
Bailly G.,
M.
Jordan, M. Mantakas, J.-L. Schwartz, M. Bach & M. Olesen
(1990).
Simulation of vocalic gestures
using an articulatory model
driven by a sequential
neural network, Journal of the
Acoustical Society of America:
S105.
National conferences
Lenglet M., O. Perrotin & G. Bailly (2023) A Closer Look at Latent Representations of End-to-end TTS Models, Journée commune AFIA-TLH / AFCP – “Extraction de connaissances interprétables pour l’étude de la communication parlée”, Avignon, France. (HAL)
Briglia, A., E. Godde, C. Boggio, M.-L. Bosse & G. Bailly (2023) ELARGIR : s’entraîner à lire avec fluence et expressivité. Rencontres des Jeunes Chercheurs en Parole (RJCP), Grenoble, France. (HAL)
Koelsch, L., G. Bailly, F. Elisei, P. Huguet & L. Ferrand (2021) L’impact des robots sur notre cognition: l’effet de présence robotique, Workshop sur
les Affects, Compagnons Artificiels et Interactions (WACAI), Oléron, France. (HAL)
Godde, E., G. Bailly & M.-L. Bosse (2019). Un
Karaoké pour Entraîner la Prosodie en
Lecture, Conférence
sur les Environnements Informatiques pour l’Apprentissage
Humain (EIAH), Paris,
France: pp. 363-366. (HAL)
Nguyen D.C., F. Elisei & G. Bailly (2016).
Demonstrating to a humanoid robot how to conduct neuropsychological
tests.
Journées Nationales de Robotique Humanoïde
(JNRH), Toulouse, France: pp.10-12. (HAL)
Gerbier, E., G. Bailly and M.-L. Bosse (2015). Effet de la synchronie audio-visuelle en lecture sur les mouvements oculaires. Congrès National de la Société Française de Psychologie, Strasbourg, France.
Bailly G.,
M.
Chetouani, M. Ochs, A. Pauchet and H. Fiorino (2014). Virtual
conversational agents and social robots: converging challenges. Workshop
Affect, Compagnon Artificiel, Interaction, Rouen, France.
Mihoub, A., G.
Bailly and
C. Wolf (2014). Modeling sensory-motor behaviors for social robots. Workshop
Affect, Compagnon Artificiel, Interaction, Rouen, France.
Fagel, S. and G.
Bailly (2010). Speech, gaze and head motion in a
face-to-face collaborative task. Electronic Speech Signal
Processing
(ESSV). Berlin. (HAL;
.pdf)
Ben Youssef, A., V.-A. Tran, P. Badin and G. Bailly
(2009). HMMs and GMMs based methods in acoustic-to-articulatory speech
inversion. 8èmes
Rencontres Jeunes Chercheurs en Parole,
Avignon
- France, A182.
Badin, P., G.
Bailly, F. Elisei, L. Lamalle, C. Savariaux & A.
Serrurier (2007). Têtes parlantes audiovisuelles virtuelles:
données, modèles et applications. Congrès de la
Société Française de Phoniatrie,
Paris.
Casari A., F. Elisei, G.
Bailly & S. Raidt (2006). Contrôle du
regard et des
mouvements
des paupières d'une tête parlante virtuelle. Workshop sur
les Agents Conversationnels Animés (WACA), Toulouse -
France. (.pdf)
Raidt, S., G.
Bailly & F. Elisei (2006). Plateforme
expérimentale de
capture-restitution croisée pour l'étude de la
communication face-à-face. Workshop sur les Agents
Conversationnels Animés, Toulouse - France. (.pdf)
Picot, A., G.
Bailly, F. Elisei & S. Raidt (2006). Scrutation de
scènes naturelles par un
agent conversationnel animé. Workshop sur les Agents
Conversationnels Animés, Toulouse - France. (.pdf)
Alami,
R., G. Bailly,
L. Brèthes, R. Chatila, A. Clodic, J. Crowley, P.
Danès,
F. Elisei, S. Fleury, M. Herrb, F. Lerasle, & P. Menezes (2005)
HR+: Towards an interactive
autonomous robot. in Journées du programme
interdisciplinaire ROBEA. Montpellier - France. (.pdf)
Alami,
R., G. Bailly
& J. Crowley (2004) HR+:
Pour une
interaction
homme-robot autonome. Journées du programme
interdisciplinaire
ROBEA, Toulouse – France. (.pdf)
Bas, P.,
J. Lienard,
J.-M.
Chassery, D. Beautemps & G.
Bailly (2003). Artus: animation
réaliste par tatouage audiovisuel à l'usage des
sourds. Journée Nationale sur « Image et
Signal pour le Handicap », Paris. (.pdf)
Niswar, A., G.
Bailly & K. Kroschel (2003) Construction of an
individualized
visual speech synthesizer from
orthogonal 2D images. Institute
for Automation and Robotics, Duisburg – Germany. (.pdf)
Alami R., G.
Bailly & J.-L. Crowley (2002) HR+: Pour une
interaction homme-robot
autonome. Journées du programme interdisciplinaire
ROBEA,
Toulouse - France: 39-40.
Elisei F., G.
Bailly, M. Odisio & P. Badin (2001) Clones
parlants
vidéo-réalistes: application à
l'interprétation de FAP-MPEG4. Colloque sur la
Compression
et Représentation des Signaux Audiovisuels (CORESA),
Dijon - France: 145-148. (.ps)
Odisio M., F. Elisei, G.
Bailly & P. Badin (2001) Clones
parlants
vidéo-réalistes: application à
l'analyse de messages audiovisuels. Colloque sur la
Compression
et Représentation des Signaux Audiovisuels (CORESA),
Dijon -
France: 141-144. (.ps)
Bailly G.,
F. Elisei & P. Badin (2001)
Télécommunications
virtuelles et clones parlants. Premières
Rencontres des Sciences et Technologies de l'Information (ASTI),
La Villette -
France: 61.
Bailly G.,
L. Revéret, P. Borel & P. Badin
(2000) Hearing by eyes thanks to the labiophone: exchanging
speech movements, COST254
workshop: Friendly Exchanging Through The Net, Bordeaux
– France. (.ps)
Rilliard A., V. Aubergé, G. Bailly &
Y.
Morlec (1997) Vers
une mesure de
l'information
linguistique véhiculée par la prosodie. 1ères
JST FRANCIL, Avignon-France: 481-487.
Morlec Y., G.
Bailly & V. Aubergé (1997)
Apprentissage
automatique d'un
module de génération multistyle de l'intonation. 1ères
JST FRANCIL, Avignon-France: 407-412.
d'Alessandro,
C., V.
Aubergé, G.
Bailly, F. Béchet, P. Boula de
Mareuil, S. Foukia, J.-P. Goldman, J. F. Isabelle, E. Keller, A.
Marchal, P. Mertens, V. Pagel, D. O'Shaughnessy, G. Richard, M. H.
Talon, E. Wehrli & F. Yvon (1997). Vers l'évaluation
de
systèmes de synthèse de parole à
partir du texte en français. 1ères JST
Francil de l'AUPEL-UREF, Avignon - France: 393-398.
Bailly G. (1995).
Pistes de
recherches en synthèse de la parole. École
Thématique: fondements et perspectives en traitement
automatique de la parole. H. Méloni. Marseille -
Luminy - France, Université d'Avignon et des Pays de
Vaucluse:
211-220.
Laboissière, R., J.-L.
Schwartz & G.
Bailly (1992)
Modélisation du contrôle moteur
en production de la parole
vers un robot parlant, Cinquième Colloque de l'ARC, Nancy, France: 115-124.
Bailly G.,
M. Bach, M. Olesen, J.-L. Schwartz & A. Morris (1990).
Génération
de trajectoires articulatoires par réseau
séquentiel. 5èmes Journées
NSI, Aussois - France: 191-192.
Bailly G. &
J. Liu (1989). Détection
de formants par quantification vectorielle et réseaux
Markoviens. Actes du séminaire Décodage
Acoustico-Phonétique, Nancy - France, Greco
Dialogue Homme-Machine: 89-94.
Academic documents
Bailly G. (2000).
Représentations phonétiques et technologies
vocales. Habilitation à Diriger des Recherches. Grenoble,
Institut National Polytechnique. (.pdf).
Bailly G. (1983).
Contribution
à la détermination automatique de la prosodie du
Français parlé à partir d'une analyse
syntaxique. Établissement d'un modèle de
génération, Institut National Polytechnique,
Grenoble - France