Bailly G., P. Perrier & E. Vatikiotis-Bateson (2012) Audiovisual Speech Processing. Cambridge, UK, Cambridge University Press, 506 pages. (HAL)
Keller E., G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale (2002) Improvements in Speech Synthesis. Chichester, England, J. Wiley & Sons, Ltd, 393 pages.
Bailly G. & C. Benoît (1992) Talking Machines: Theories, Models and Designs. Amsterdam, North-Holland, 523 pages.
Conferences & workshops organized
Bailly G., O. Perrotin, T. Hueber, D. Lolive & N. Obin (2023) 12th Speech Synthesis Workshop, 26-28 August, Grenoble - France.
Perrotin O., G. Bailly & S. King (2023) Blizzard Challenge, 29 August, Grenoble - France.
Bailly G., G. Skantze & S. Al Moubayed (2017) Speech and HRI, Interspeech, Stockholm, Sweden.
Kim J. & G. Bailly (2016) Auditory-visual expressive speech and gesture in humans and machines, Interspeech, San Francisco, CA.
Fagel S., B. J. Theobald & G. Bailly (2009) LIPS 2009: Visual Speech Synthesis Challenge, AVSP, Norwich - UK.
Fagel S., B. J. Theobald & G. Bailly (2008) LIPS 2008: Visual Speech Synthesis Challenge, Interspeech, Brisbane - Australia.
Bailly G. (2008) Talking heads and pronunciation training, Interspeech, Brisbane - Australia.
Bailly G. & G. Potamianos (2008) Multimodal speech technology, Acoustics, Paris.
Bailly G., N. Campbell & B. Mobius (2003) Hot Topics in Speech Synthesis, EuroSpeech, Geneva, Switzerland.
Guest editor
Kim J., G. Bailly & C. Davis (2018) "Auditory-visual expressive speech and gesture in humans and machines", Speech Communication. (CfP)
Dohen M., J.-L. Schwartz & G. Bailly (2010) "Speech and Face-to-Face Communication", Speech Communication, 52(3): 598–612. (HAL; .pdf)
Fagel S., G. Bailly & B. J. Theobald (2009) "Animating Virtual Speakers or Singers from Audio: Lip-Synching Facial Animation", EURASIP Journal on Audio, Speech, and Music Processing, 2009(ID 826091): 2 pages. (HAL)
Campbell N., W. Hamza, H. Höge, J. Tao & G. Bailly (2006) "Special section on expressive speech synthesis", IEEE Trans. on Audio, Speech, and Language Processing, 14(4): 1097-1098.
Reviewed journal articles
Birulès J., A. Duroyal, A. Vilain, G. Bailly & M. Fort (submitted) "French speakers favor prosody over statistics to segment speech", Language Learning.
Lenglet M., O. Perrotin & G. Bailly (submitted) "A Closer Look at Internal Representations of End-to-End Text-to-Speech Models: How is Phonetic and Acoustic Information Encoded?", Computer, Speech and Language. Available at SSRN
Fournier H., S. Alisamir, S. Azzakhnini, I. Zsoldos, E. Trân, G. Bailly, F. Elisei, B. Bouchot, B. Varini, P. Constant, J. Fruitet, F. Tarpin-Bernard, S. Rossato, F. Portet, O. Koenig, H. Chainay & F. Ringeval (2025) "THERADIA WoZ: An Ecological Corpus for Appraisal-based Affect Research in Healthcare", IEEE Transactions on Affective Computing. DOI (HAL)
Haefflinger L., F. Elisei & G. Bailly (2025) "Data-Driven Control of Eye and Head movements for Triadic Human-Robot Interactions", International Journal of Social Robotics. DOI (HAL)
Perrotin O., B. Stephenson, S. Gerber, G. Bailly & S. King (2025) "Refining the Evaluation of Speech Synthesis: A Summary of the Blizzard Challenge 2023", Computer, Speech & Language, 90, 101747. DOI (HAL)
Zsoldos I., E. Trân, H. Fournier, F. Tarpin-Bernard, J. Fruitet, M. Fouillen, G. Bailly, F. Elisei, B. Bouchot, P. Constant, F. Ringeval, O. Koenig & H. Chainay (2024)
"The value of a virtual assistant to improve engagement in computerized cognitive training at home: An exploratory study", JMIR Rehabilitation and Assistive Technologies, 11, e48129. DOI (HAL)
Bailly G., E. Godde, A.-L. Piat-Marchand & M.-L. Bosse (2022)
"Automatic assessment of aloud readings of young pupils", Speech Communication, 138:67-79. DOI. (HAL)
Godde E., G. Bailly & M.-L. Bosse (2022)
"Pausing and breathing while reading aloud: development from 2nd to 7th grade", Reading and Writing, 35:1-27.
DOI. (HAL)
Godde E., M.-L. Bosse & G. Bailly (2021)
"Echelle Multi-Dimensionnelle de Fluence: nouvel outil d'évaluation de la fluence en lecture prenant en compte la prosodie, étalonné du CE1 à la 5ème", L'année psychologique, 12:19-43, DOI. (HAL)
Godde E., M.-L. Bosse & G. Bailly (2020)
"A review of reading prosody acquisition and development", Reading and Writing, 33(2), 399-426.
DOI. (HAL)
Kim J., G. Bailly & C. Davis (2018) "Introduction to the Special Issue on Auditory-visual expressive speech and gesture in humans and machines", Speech Communication, 63-67. DOI. (HAL)
Gerbier E., G. Bailly & M.-L. Bosse (2018) "Audiovisual Synchronization in Reading while Listening to Texts: Effects on Visual Behavior and Verbal Learning", Computer, Speech and Language, 47:74-92. DOI. (HAL)
Barbulescu A., R. Ronfard & G. Bailly (2017) "Exercises in Speaking Style: A Generative Audiovisual Prosodic Model for Virtual Actors", Computer Graphics Forum, 37-6:40-51. DOI.
(HAL)
Barbulescu A., R. Ronfard & G. Bailly (2017) "Which prosodic features contribute to the recognition of dramatic attitudes?", Speech Communication, 95:78-86. DOI. (HAL)
Nguyen, D.A., G. Bailly & F. Elisei (2017) "Learning Off-line vs. On-line Models of Interactive Multimodal Behaviors with
Recurrent Neural Networks", Pattern Recognition Letters, 100C:29-36. DOI. (HAL)
Bailly G. (2016) "Critical review of the book Gaze in Human-Robot Communication", Journal on Multimodal User Interfaces, 1-2. DOI. (HAL)
Mihoub A., G. Bailly, C. Wolf & F. Elisei (2016) "Graphical models for social behavior modeling in face-to-face interaction", Pattern Recognition Letters, 74:82-89. DOI. (HAL)
Parmiggiani A., M. Randazzo, M. Maggiali, G. Metta, F. Elisei & G. Bailly (2015) "Design and Validation of a Talking Face for the iCub", International Journal of Humanoid Robotics, 1550026:1-20. DOI. (HAL)
Hueber T., L. Girin, X. Alameda & G. Bailly (2015) "Speaker-adaptive acoustic-articulatory inversion using cascaded Gaussian mixture regression", Transactions on Audio, Speech and Language Processing, 23(12): 2246-2259. DOI. (HAL)
Mihoub A., G. Bailly, C. Wolf & F. Elisei (2015) "Learning multimodal behavioral models for face-to-face social interaction", Journal on Multimodal User Interfaces (JMUI), 9(3): 195-210. DOI. (HAL)
Hueber T. & G. Bailly (2015) "Statistical Conversion of Silent Articulation into Audible Speech using Full-Covariance HMM", Computer, Speech and Language, 36: 274–293. DOI. (HAL)
Badin P., C. Savariaux, G. Bailly, F. Elisei & L.-J. Boë (2012) "Caractérisation des mécanismes de production de la parole: une approche biométrique et modélisatrice mono-locuteur et multi-dispositifs", Biométrie Humaine et Anthropologie - Hommage à Bernard Teston, 30(1-2): 67-77. (HAL)
Boucher J.-D., U. Pattacini, A. Lelong, G. Bailly, P. F. Dominey, F. Elisei, S. Fagel & J. Ventre-Dominey (2012) "I reach faster when I see you look: Gaze effects in human-human and human-robot face-to-face cooperation", Frontiers in Neurorobotics, 6(3). DOI. (HAL; .pdf)
Heracleous P., P. Badin, G. Bailly & N. Hagita (2011) "A pilot study on augmented speech communication based on Electro-Magnetic Articulography", Pattern Recognition Letters, 32: 1119-1125. DOI. (HAL)
Bailly G., S. Raidt & F. Elisei (2010) "Gaze, conversational agents and face-to-face communication", Speech Communication - special issue on Speech and Face-to-Face Communication, 52(3): 598–612. (HAL; .pdf)
Badin P., Y. Tarabalka, F. Elisei & G. Bailly (2010) "Can you read tongue movements? Evaluation of the contribution of tongue display to speech understanding", Speech Communication - special issue on Speech and Face-to-Face Communication, 52(3): 493-503. (HAL; .pdf)
Tran V.-A., G. Bailly & H. Loevenbruck (2010) "Improvement to a NAM-captured whisper-to-speech system", Speech Communication - special issue on Silent Speech Interfaces, 52(4): 314-326. (HAL; .pdf)
Bailly G., O. Govokhina, F. Elisei & G. Breton (2009) "Lip-synching using speaker-specific articulation, shape and appearance models", EURASIP Journal on Audio, Speech, and Music Processing - special issue on "Animating Virtual Speakers or Singers from Audio: Lip-Synching Facial Animation", ID 769494: 11 pages. (.pdf)
Heracleous P., D. Beautemps, V.-A. Tran, H. Loevenbruck & G. Bailly (2009) "Exploiting visual information for NAM recognition", IEICE Electronics Express, 6(2):77-82. (HAL;.pdf).
Bailly G., F. Elisei & S. Raidt (2008) "Boucles de perception-action et interaction face-à-face", Revue Française de Linguistique Appliquée, XIII(2): 121-131. (HAL; .pdf)
Badin P., F. Elisei, G. Bailly, C. Savariaux, A. Serrurier & Y. Tarabalka (2007) "Têtes parlantes audiovisuelles virtuelles: Données et modèles articulatoires - applications", Revue de Laryngologie, 128(5): 289-295. (.pdf)
Caplier A., S. Stillittano, O. Aran, L. Akarun, G. Bailly, D. Beautemps, N. Aboutabit & T. Burger (2007) "Image and video for hearing impaired people", EURASIP Journal on Image and Video Processing (electronic journal), 2007, 14 p. (.pdf)
Bailly G., C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin, D. Grison, J.-P. Léoni, J. Liénard, N. Moreau & P. Nguyen (2007) "ARTUS: synthesis and audiovisual watermarking of the movements of a virtual agent interpreting subtitling using cued speech for deaf televiewers", AMSE - Advances in Modelling, 67: 177-187. (.pdf)
Bérar M., M. Desvignes, G. Bailly & Y. Payan (2006) "3D semi landmarks-based statistical face reconstruction", International Journal of Computing and Information Technology, 14(1): 31-43. (.pdf)
Bailly G. & B. Holm (2005) "SFC: a trainable prosodic model", Speech Communication, Special issue on Quantitative Prosody Modelling for Natural Speech Description and Generation (edited by K. Hirose, D. Hirst and Y. Sagisaka), 46(3-4): 348-364. (.pdf)
Gibert G., G. Bailly, D. Beautemps, F. Elisei & R. Brun (2005) "Analysis and synthesis of the 3D movements of the head, face and hands of a speech cuer", Journal of the Acoustical Society of America, 118(2): 1144-1153. (DOI, .pdf)
Odisio M., G. Bailly & F. Elisei (2004) "Tracking talking faces with shape and appearance models", Speech Communication, 44: 63-82. (DOI, .pdf)
Bailly G., M. Bérar, F. Elisei & M. Odisio (2003) "Audiovisual speech synthesis", International Journal of Speech Technology, 6: 331-346. (DOI, .pdf)
Bailly G. (2003) "Close shadowing natural versus synthetic speech", International Journal of Speech Technology, 6(1): 11-19. (DOI, .pdf)
Apostol L., P. Perrier & G. Bailly (2003) "A model of acoustic interspeaker variability based on the concept of formant-cavity affiliation", Journal of the Acoustical Society of America, 115(1): 337-351. (DOI, .pdf)
Bailly G. & B. Holm (2002) "Learning the hidden structure of speech: from communicative functions to prosody", Cadernos de Estudos Linguisticos, 43: 37-54. (DOI)
Badin P., G. Bailly, L. Revéret, M. Baciu, C. Segebarth & C. Savariaux (2002) "Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images", Journal of Phonetics, 30(3): 533-553. (DOI, .pdf)
Morlec Y., G. Bailly & V. Aubergé (2001) "Generating prosodic attitudes in French: data, model and evaluation", Speech Communication, 33(4): 357-371. (DOI, .pdf)
Beautemps D., P. Badin & G. Bailly (2001) "Degrees of freedom in speech production: analysis of cineradio- and labio-films data for a reference subject, and articulatory-acoustic modeling", Journal of the Acoustical Society of America, 109(5): 2165-2180. (DOI, .pdf)
Mawass K., P. Badin & G. Bailly (2000) "Synthesis of French Fricatives by Audio-Video to Articulatory Inversion", Acta Acustica, 86: 136-146. (.pdf)
Yvon F., P. Boula de Mareuil, C. d'Alessandro, V. Aubergé, M. Bagein, G. Bailly, F. Béchet, S. Foukia, J.-F. Goldman, E. Keller, D. O'Shaughnessy, V. Pagel, F. Sannier, J. Véronis & B. Zellner (1998) "Objective evaluation of grapheme to phoneme conversion for text-to-speech synthesis in French", Computer Speech and Language, 12: 393-410. (HAL, .pdf)
Bailly G. (1998) "Cortical dynamics and biomechanics", Bulletin de la Communication Parlée, 4: 35-44.
Bailly G. (1997) "Learning to speak. Sensori-motor control of speech movements", Speech Communication, 22(2-3): 251-267. (DOI, .pdf)
Barbosa P. & G. Bailly (1994) "Characterisation of rhythmic patterns for text-to-speech synthesis", Speech Communication, 15: 127-137. (.ps)
Bonnyman J., K. M. Curtis & G. Bailly (1993) "A transputer-based recurrent neural network for resonance tracking of speech", Transputer Applications and Systems, 1: 1219-1228.
Boë L.-J., P. Perrier & G. Bailly (1992) "The geometric vocal tract variables controlled for vowel production: Proposals for constraining acoustic-to-articulatory inversion", Journal of Phonetics, 20(1): 27-38.
Bailly G. & M. Alissali (1992) "COMPOST: a server for multilingual text-to-speech system", Traitement du Signal, 9(4): 359-366.
Laboissière R., J.-L. Schwartz & G. Bailly (1991) "Modelling the speaker-listener interaction in a quantitative model for speech motor control: a framework and some preliminary results", PERILUS XIV - Department of Linguistics: 57-62.
Bailly G., R. Laboissière & J.-L. Schwartz (1991) "Formant trajectories as audible gestures: An alternative for speech synthesis", Journal of Phonetics, 19(1): 9-23.
Bailly G. & J. Liu (1990) "Détection d'indices par quantification vectorielle et réseaux Markoviens", Journal d'Acoustique, 3: 143-151.
Bailly G. (1989) "Integration of rhythmic and syntactic constraints in a model of generation of French prosody", Speech Communication, 8: 137-146.
Bailly G., A. Perrin & Y. Lepage (1988) "Common approaches in speech synthesis and automatic translation of text", Bulletin du Laboratoire de la Communication Parlée - Grenoble, 2B: 295-311.
Dakkak O. A., G. Murillo, G. Bailly & B. Guérin (1988) "A database of formant parameters for knowledge extraction and synthesis-by-rule", Bulletin du Laboratoire de la Communication Parlée, 391-405.
Book chapters
Bailly, G., A. Mihoub, C. Wolf & F. Elisei (2018). Gaze and face-to-face interaction: from multimodal data to behavioral models. in Advances in Interaction Studies. Eye-tracking in interaction. Studies on the role of eye gaze in dialogue. G. Brône & B. Oben. Amsterdam, NL, John Benjamins: 139-168. DOI. (HAL)
Bailly, G., P. Badin, L. Revéret & A. Ben Youssef (2012). Sensorimotor characteristics of speech production. Audiovisual Speech Processing. G. Bailly, P. Perrier and E. Vatikiotis-Bateson. Cambridge, UK, Cambridge University Press: 368-396. (HAL)
Bailly, G., F. Elisei & S. Raidt (2011). Des machines parlantes aux agents conversationnels incarnés. in Informatique et Sciences Cognitives : influences ou confluences, D. Kayser and C. Garbay, Editors. Ophrys: Paris: 215-234. (HAL)
Lelong, A. & G. Bailly (2011). Study of the phenomenon of phonetic convergence thanks to speech dominoes. Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issue. A. Esposito, A. Vinciarelli, K. Vicsi, C. Pelachaud and A. Nijholt. Berlin, Springer Verlag: 280-293. (HAL)
Fagel, S. & G. Bailly (2010). Speech, gaze and head motion in a face-to-face collaborative task. Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces: Theoretical and Practical Issues. A. Esposito, A. M. Esposito, R. Martone, V. C. Müller and G. Scarpetta. Berlin, Springer Verlag, Lecture Notes in Computer Science (LNCS). 6456: 265-274. (HAL)
Bailly, G., P. Badin, D. Beautemps & F. Elisei (2010). Speech technologies for augmented communication. in Computer-Synthesized Speech Technologies: Tools for Aiding Impairment. J. Mullenix and S. Stern. Hershey, PA, IGI Global: 116-128. (HAL)
Attina, V., G. Gibert, M.-A. Cathiard, G. Bailly & D. Beautemps (2010). The Analysis of French Cued Speech Production-Perception: Towards a complete Text-to-Cued Speech Synthesizer. in Cued Speech and Cued Language Development for Deaf and Hard of Hearing Children, C. LaSasso, K.L. Crain, and J. Leybaert, Editors. Plural Publishing, Inc.: San Diego, CA. p. 449-466. (HAL)
Bailly, G. & C. Pelachaud (2009). Parole et expression des émotions sur le visage d'humanoïdes virtuels. in Traité de la réalité virtuelle: Volume 5 : les humains virtuels. P. Fuchs, G. Moreau and S. Donikian. Paris, Presses de l'Ecole des Mines de Paris. 5: 187-208. (.pdf)
Badin, P., F. Elisei, G. Bailly & Y. Tarabalka (2008). An audiovisual talking head for augmented speech generation: Models and animations based on a real speaker's articulatory data. Conference on Articulated Motion and Deformable Objects, Mallorca, Spain, Springer LNCS, 132-143. (.pdf)
d'Alessandro, C., P. Boula de Mareüil, M.-N. Garcia, G. Bailly, M. Morel, A. Raake, F. Béchet, J. Véronis and R. Prudon (2008). La campagne EvaSy d'évaluation de la synthèse de la parole à partir du texte. L'évaluation des technologies de traitement de la langue. S. Chaudiron and K. Choukri. Paris, Hermès: 183-208.
Bailly, G., F. Elisei & S. Raidt (2006). Virtual talking heads and ambient face-to-face communication. The fundamentals of verbal and non-verbal communication and the biometrical issue. A. Esposito, E. Keller, M. Marinaro and M. Bratanic. Amsterdam, IOS Press BV. (.pdf)
Bérar, M., G. Bailly, M. Chabanas, M. Desvignes, F. Elisei, M. Odisio & Y. Payan (2006). Towards a generic talking head. in Towards a better understanding of speech production processes, J. Harrington and M. Tabain, Editors. Psychology Press: New York, 341-362. (.pdf)
Bailly, G. (2001). Towards more versatile signal generation systems. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 18-21. (.pdf)
Bailly, G. (2001). The COST 258 signal generation test array. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken & M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 39-51. (.pdf)
Bailly, G. (2001). A parametric harmonic + noise model. Improvements in Speech Synthesis. E. Keller, G. Bailly, A. I. C. Monaghan, J. Terken and M. Huckvale. Chichester, England, J. Wiley & Sons, Ltd: 22-38. (.pdf)
Barbosa, P. & G. Bailly (1997). Generation of pauses within the z-score model. Progress in Speech Synthesis. J. P. H. V. Santen, R. W. Sproat, J. P. Olive and J. Hirschberg. New York, Springer Verlag: 365-381.
Bailly, G., V. Aubergé & Y. Morlec (1997). Des représentations cognitives aux représentations phonétiques de l'intonation. Polyphonie pour Iván Fónagy. J. Perrot, L'Harmattan: 19-28.
Bailly, G. & V. Aubergé (1997). Phonetic and phonological representations for intonation. Progress in Speech Synthesis. J. P. H. V. Santen, R. W. Sproat, J. P. Olive and J. Hirschberg. New York, Springer Verlag: 435-441. (.ps)
Bailly, G. (1997). No future for comprehensive models of intonation? Computing prosody: Computational models for processing spontaneous speech. Y. Sagisaka, N. Campbell and N. Higuchi, Springer Verlag: 157-164. (.pdf)
Bailly, G. (1996). Pistes de recherches en synthèse de la parole. Fondements et perspectives en traitement automatique de la parole. H. Méloni. Paris - France, AUPELF-UREF: 109-122.
Bailly, G. (1995). Characterisation of formant trajectories by tracking vocal tract resonances. Levels in speech communication: relations and interactions. C. Sorin, J. Mariani, H. Méloni and J. Schoentgen. Amsterdam, Elsevier: 91-102. (.ps)
Bailly, G., T. Barbe & H. Wang (1992). Automatic labelling of large prosodic databases: tools, methodology and links with a text-to-speech system. Talking Machines: Theories, Models and Designs. G. Bailly and C. Benoît. Amsterdam, North-Holland: 323-333.
Bailly, G., C. Abry, L.-J. Boë, R. Laboissière, P. Perrier & J.-L. Schwartz (1992). Inversion and speech recognition. Signal Processing VI: Theories and Applications. J. Vandewalle, R. Boîte, M. Moonen and A. Oosterlinck. Amsterdam, Elsevier. 1: 159-164.
Bailly, G. (1989). Synthèse de la parole. La parole et son traitement automatique. J.-P. Tubach. Paris, Masson: 408-448.
Invited talks
Bailly, G. (2025) Téléopération et apprentissage de comportements sociaux pour un robot humanoïde conversationnel: des données aux modèles, réseau de recherche 2RSHS, SciencesConf, 12 Juin.
Bailly, G. (2024) Controlling and Probing Generative End-to-end Models: New Opportunities for Research on Prosody, SPROSIG lecture series, Youtube, 27 November.
Bailly, G. (2024) End-to-end models as a proxy to probe massive data, CSTR, Edinburgh, UK, 2 April.
Bailly, G. (2023) End-to-end models as a proxy to probe massive data, Oriental COCOSDA, Delhi, India, 12 December.
Bailly, G. (2022) Exploring latent spaces of end-to-end text-to-speech synthesis systems, International Conference on Speech and Computer (SPECOM), Dharwad, India, 14 November.
Bailly, G. (2021) Training social robots: the devil is in the details, Furhat webinar, 14 October.
Bailly, G. (2021) Characterizing and assessing the oral reading fluency of young readers, Iberspeech, Valladolid, Spain, 24 March
Bailly, G. (2020) Characterizing and assessing the reading fluency of young readers, Séminaire du groupe GETALP, LIG, Grenoble, France.
Bailly, G. (2019) Apprendre des comportements sociaux à un robot humanoïde par téléopération immersive, Inauguration Creativ’Lab CPS/Robotique du LORIA, Nancy, France.
Bailly, G. (2019) Téléopération immersive de robots humanoïdes, Journées Nationales de la Recherche en Robotique (JNRR), Vittel, France.
Elisei F. & Bailly, G. (2018) Téléopération immersive d'un robot humanoïde pour la collecte et la modélisation de corpus d'interactions homme-robot, Journées Neuroscience et multi-modalité dans les interactions humain-humain, humain-agent ou humain-robot, Paris, France.
Bailly, G. (2018) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, IIT Guwahati, Guwahati, India.
Bailly, G. (2018) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, Workshop «Emotionally Intelligent Social Robots» organised by labcom Behaviors.AI (joint lab between LIRIS & Hoomano) and sponsored by ARC6, Lyon, France.
Bailly, G. (2017) Demonstrating, learning & evaluating socio-communicative behaviors for HRI, Joint meeting of GT5/GDR Robotique & GT ACAI/GDR ISIS, Paris, France.
Bailly, G., F. Elisei & D.A. Nguyen (2017) Social robots for diagnosis and rehabilitation, Humanitas Clinical and Research Center, Milan, Italy.
Bailly, G., F. Elisei & D.A. Nguyen (2016) Providing a humanoid robot with socio-communicative skills: the SOMBRERO approach. Xerox Research Centre Europe (XRCE) Seminar, Meylan, France.
Bailly, G. & M. Garnier (2016) Phonetic convergence: data, mechanisms & motivations. Experimental Linguistics, St Petersburg, Russia.
Bailly, G. (2016) Prosodic modelling. Experimental Linguistics, St Petersburg, Russia.
Bailly G. (2015) Round table on the acceptability of robotic technologies. Entretiens Jacques Cartier ROBOTIQUE, SERVICES ET SANTÉ, Lyon, France.
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015) Demonstrating & learning interactive multimodal behaviors for humanoid robots. French-Japanese-German Workshop on Human-Centric Robotics, Munich, Germany.
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015) Learning joint multimodal behaviors for face-to-face interaction: performance & properties of statistical models. Human-Robot Interaction (HRI), Workshop on behavior coordination between animals, humans and robots, Portland, OR.
Bailly G. (2014) Convergence phonétique, Journée scientifique annuelle "Langage & Cognition" du Pôle Cognition (GPC), Grenoble - France.
Bailly G. (2014) Apprentissage de modèles de comportements multimodaux, Journée scientifique "Robotique interactionnelle" de l'ARC5, Grenoble - France.
Bailly G. (2014) Apprentissage de modèles de comportements multimodaux, Ecole de printemps "Robotique et Interactions Sociales", Moliets et Maa - France.
Bailly G. (2013) L'interaction verbale et non-verbale, Journées Nationales de la Recherche en Robotique (JNRR), Annecy - France.
Bailly G. (2013) Parole et expressions faciales, Journée LIMA/GdR ISIS sur l'émotion, Lyon - France.
Bailly G. (2013) Interaction homme-agent conversationnel et génération de comportements verbaux et co-verbaux. Journée de l'Intelligence embarquée, Université de Cergy-Pontoise, Cergy - France.
Bailly G. (2011) Mutual attention and accommodation during face-to-face interaction. UWS Summer Research Festival, Researching Communication (ResCom) at UWS: Brain, Behaviour and Computation, Sydney - Australia.
Bailly G. & A. Lelong (2010). Phonetic convergence: overview and preliminary results. 3rd COST 2102 International Training School: "Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces: Theoretical and Practical Issues", Caserta - Italy.
Bailly G. (2009). Situated face-to-face communication with avatars. 2nd COST 2102 International Training School: "Development of Multimodal Interfaces: Active Listening and Synchrony", Dublin - Ireland.
Bailly G., F. Elisei, S. Raidt, A. Casari & A. Picot (2006). Embodied conversational agents: computing and rendering realistic gaze patterns. Pacific Rim Conference on Multimedia Processing, Hangzhou - China.
Bailly G., N. Campbell & B. Mobius (2003). ISCA Special Session: Hot Topics in Speech Synthesis. EuroSpeech, Geneva - Switzerland.
Bailly G. (2002). Audiovisual speech synthesis. From ground truth to models. International Conference on Speech and Language Processing, Boulder - USA.
Bailly G. (2001). Audiovisual speech synthesis. ETRW on Speech Synthesis, Perthshire - Scotland.
Bailly G. (1996) Sensori-motor control of speech movements. ETRW on Speech Production Modelling: from Control Strategies to acoustics, Autrans - France.
Bailly G. (1990). Robotics in speech production:motor control theory. ETRW Workshop on Speech Synthesis, Autrans - France.
International conferences with review process
2025
Bailly, G., E. André, E. Cooper, B. Cowan, J. Edlund, N. Harte, S. King, E. Klabbers, S. Le Maguer, Z. Malisz, R. K. Moore, B. Möbius, S. Möller, A. Pandey, O. Perrotin, F. Seebauer, S. Strömbergsson, D. R. Traum, C. Tånnander, P. Wagner, J. Yamagishi & Y. Yasuda (submitted) Hot topics in speech synthesis evaluation, Speech Synthesis Workshop (SSW), Leeuwarden, NL.
Sankar S., M. Lenglet, G. Bailly, D. Beautemps & T. Hueber (2025) Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model. ICASSP, Hyderabad, India. (HAL)
2024
Lenglet M., O. Perrotin & G. Bailly (2024) FastLips: an End-to-End Audiovisual Text-to-Speech System with Lip Features Prediction for Virtual Avatars. Interspeech, Kos, Greece, pp. 3450-3454. DOI. (HAL)
Charuau D., A. Briglia, E. Godde & G. Bailly (2024) Training speech-breathing coordination in computer-assisted reading. Interspeech, Kos, Greece. pp. 5128-5132. (DOI, HAL)
Elisei F., L. Haefflinger & G. Bailly (2024) RoboTrio: an annotated multimodal corpus of the interactions of two Humans with a teleoperated robot, Communication in Human-AI Interaction (CHAI), Malmö, Sweden.
Bailly G., R. Legrand, M. Lenglet, F. Elisei, M. Garnier & O. Perrotin (2024) Emotags: Computer-assisted verbal labelling of expressive audiovisual utterances for expressive multimodal TTS, Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italy. (HAL)
Charuau D., A. Briglia, E. Godde & G. Bailly (2024) Entraînement de la coordination respiration-parole en apprentissage de la lecture assistée par ordinateur. Journées d'Etudes sur la Parole, pp.351-360, Toulouse, France. (DOI, HAL)
Godde E., M.-L. Bosse & G. Bailly (2024) A reading karaoke to improve reading rate, reading prosody and comprehension, 31st Annual Conference Society for the Scientific Study of Reading (SSSR), Copenhagen, Denmark. (HAL)
Briglia A., E. Godde, D. Charuau, M.-L. Bosse & G. Bailly (2024) A karaoke-based game to improve the ability of French primary school pupils in planning pauses and breathing while reading aloud, 31st Annual Conference Society for the Scientific Study of Reading (SSSR), Copenhagen, Denmark. (HAL)
Younès, R., F. Elisei, D. Pellier & G. Bailly (2024) Impact of verbal instructions and deictic gestures of a cobot on the performance of human coworkers, IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 1025-1032, Nancy, France. (HAL)
Ringeval, F., B. Schuller, G. Bailly, S. Azzakhnini & H. Fournier (2024) EVAC 2024 — Empathic Virtual Agent Challenge: Appraisal-based Recognition of Affective States, International Conference on Multimodal Interaction (ICMI), pp. 677-683, Costa Rica. DOI.
2023
Lenglet M., O. Perrotin & G. Bailly (2023) The GIPSA-Lab Text-To-Speech System for the Blizzard Challenge 2023. 18th Blizzard Challenge Workshop, pp. 34-39, Grenoble, France, DOI. (HAL)
Perrotin O., B. Stephenson, S. Gerber & G. Bailly (2023) The Blizzard Challenge 2023. 18th Blizzard Challenge Workshop, Grenoble, France. pp. 1-27, DOI. (HAL)
Bailly G., M. Lenglet, O. Perrotin & E. Klabbers (2023) Advocating for text input in multi-speaker text-to-speech systems, Speech Synthesis Workshop (SSW), pp. 1-7, Grenoble, France. DOI. (HAL)
Lenglet M., O. Perrotin & G. Bailly (2023) Local Style Tokens: Fine-Grained Prosodic Representations for TTS Expressive Control, Speech Synthesis Workshop (SSW), pp. 120-126, Grenoble, France. DOI. (HAL)
Haefflinger L., F. Elisei, S. Gerber, B. Bouchot, J.-P. Vigne & G. Bailly (2023) Data-driven Generation of Eyes and Head Movements of a Social Robot in Multiparty Conversation, International Conference on Social Robotics (ICSR), pp. 191-203, Doha, Qatar. DOI. (HAL)
Haefflinger L., F. Elisei, B. Bouchot, J.-P. Vigne & G. Bailly (2023) On the benefit of independent control of head and eye movements of a social robot for multiparty human-robot interaction, HCII, pp. 450-466, Copenhagen, Denmark. DOI. (HAL)
2022
Hajj M.-L., M. Lenglet, O. Perrotin & G. Bailly (2022) Comparing NLP solutions for the disambiguation of French heterophonic homographs for end-to-end TTS systems, Speech and Computer (SpeCom), pp. 265–278, Gurugram, India, In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Lecture Notes in Computer Science, vol 13721. Springer. (DOI. HAL)
Lenglet M., O. Perrotin & G. Bailly (2022) Speaking Rate Control of end-to-end TTS Models by Direct Manipulation of the Encoder's Output Embeddings, Interspeech, pp. 11-15, Seoul, South Korea.
(DOI, HAL)
Younes, R., G. Bailly, F. Elisei & D. Pellier (2022) Automatic Verbal Depiction of a Brick Assembly for a Robot Instructing Humans, SIGDIAL, pp. 159-171, Edinburgh, UK.
(DOI, HAL)
Lenglet M., O. Perrotin & G. Bailly (2022) Modélisation de la Parole avec Tacotron2: Analyse acoustique et phonétique des plongements de caractère, Journées d'Etudes sur la Parole (JEP), pp. 845-854, Noirmoutier, France. (DOI, HAL)
2021
Koelsch L., F. Elisei, L. Ferrand, P. Chausse, G. Bailly & P. Huguet (2021) Impact of social presence of humanoid robots: does competence matter? International Conference on Social Robotics (ICSR), pp. 729-739, Singapore. (HAL) [BEST PAPER AWARD]
Perrotin O., A. Hussein, G. Bailly & T. Hueber (2021) Evaluating the extrapolation capabilities of neural vocoders to extreme pitch values, Interspeech, pp. 11-15, Brno, Czech Republic. DOI. (HAL)
Tarpin-Bernard F., J. Fruitet, J.P. Vigne, P. Constant, H. Chainay, O. Koenig, F. Ringeval, B. Bouchot, G. Bailly, F. Portet, S. Alisamir, Y. Zhou, J. Serre, V. Delerue, H. Fournier, K. Berenger, I. Zsoldos, O. Perrotin, F. Elisei, M. Lenglet, C. Puaux, L. Pacheco, M. Fouillen & D. Ghenassia (2021) THERADIA: digital therapies augmented by Artificial Intelligence, International Conference on Applied Human Factors and Ergonomics (AHFE), pp. 478–485, New York, USA. (DOI, HAL)
Godde E., M.-L. Bosse & G. Bailly (2021) Causal links between comprehension and fluency dimensions including prosody from grade 2 to 4, Society for the Scientific Study of Reading Conference, Lancaster, UK.
Mandin S., A. Zaher, S. Meyer, M. Loiseau, G. Bailly, C. Payre-Ficout, J. Diard, Fluence-Group & S. Valdois (2021) Expérimentation à grande échelle d'applications pour tablettes pour favoriser l'apprentissage de la lecture et de l'anglais, Conférence sur les Environnements Informatiques pour l'Apprentissage Humain (EIAH), Fribourg, Switzerland. (HAL)
Godde E., G. Bailly, A.-L. Piat-Marchand & M.-L. Bosse (2021) Suivi longitudinal de la fluence en lecture par évaluation automatique, Conférence sur les Environnements Informatiques pour l'Apprentissage Humain (EIAH), Fribourg, Switzerland. (HAL)
Lenglet M., O. Perrotin & G. Bailly (2021) Impact of segmentation and annotation in French end-to-end synthesis, Speech Synthesis Workshop (SSW), pp. 13-18, Budapest, Hungary. DOI. (HAL)
2020
Bailly, G., E. Godde, A.-L. Piat-Marchand & M.-L. Bosse (2020) Predicting multidimensional subjective ratings of children's readings from the speech signals for the automatic assessment of fluency, International Conference on Language Resources and Evaluation (LREC), pp. 317-322, Marseille, France. (HAL)
Bailly, G. & F. Elisei (2020) Speech in action: designing challenges that require incremental processing of self and others' speech and performative gestures, Workshop on Natural Language Generation for Human-Robot Interaction at Human-Robot Interaction (NLG4HRI 2020), Dublin (virtual), Ireland.
(HAL)
2019
Mohammed, O., G. Bailly & D. Pellier (2019) Style transfer and extraction for the handwritten letters using deep learning, International Conference on Agents and Artificial Intelligence (ICAART), Prague, Czech Republic, pp. 677-684.
(HAL)
Godde E., G. Bailly & M.-L. Bosse (2019) Reading prosody development: automatic assessment for a longitudinal study, Speech & Language Technology for Education (SLaTE), pp. 104-108, Graz, Austria. (DOI, HAL)
2018
Bailly, G. & F. Elisei (2018) Demonstrating and learning multimodal socio-communicative behaviors for HRI: building interactive models from immersive teleoperation data, AI-MHRI: AI for Multimodal Human Robot Interaction Workshop at the Federated AI Meeting (FAIM), pp. 39-43, Stockholm - Sweden. (DOI, HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2018) Comparing cascaded LSTM architectures for generating gaze-aware head motion from speech in HAI task-oriented dialogs, HCI International, pp. 164-175, Las Vegas, USA. (DOI, HAL)
Cambuzat, R., Elisei, F., Bailly, G., Simonin, O. & Spalanzani, A. (2018) Immersive teleoperation of the eye gaze of social robots, International Symposium on Robotics (ISR), Munich, Germany: pp. 232-239. (HAL)
Gerazov, B. & G. Bailly (2018) PySFC - A system for prosody analysis based on the Superposition of Functional Contours prosody model, Speech Prosody, Poznań, Poland: pp. 774-778. (DOI, HAL)
Gerazov, B., G. Bailly & Y. Xu (2018) The significance of scope in modelling tones in Chinese, International Symposium on Tonal Aspects of Languages (TAL), Berlin, Germany: pp. 183-187. DOI. (HAL)
Gerazov, B., G. Bailly & Y. Xu (2018) A Weighted Superposition of Functional Contours model for modelling contextual prominence of elementary prosodic contours, Interspeech, Hyderabad, India: pp. 2524-2528. DOI. (HAL)
Gerazov, B., G. Bailly, O. Mohammed & Y. Xu (2018) Embedding Context-Dependent Variations of Prosodic Contours using Variational Encoding for Decomposing the Structure of Speech Prosody, Workshop on Prosody and Meaning: Information Structure and Beyond, Aix-en-Provence, France. (ARXIV)
Mohammed, O., G. Bailly and D. Pellier (2018) Handwriting styles: benchmarks and evaluation metrics, First International Workshop on Deep and Transfer Learning at the International Conference on Social Networks Analysis, Management and Security (SNAMS), Valencia, Spain: pp. 159-166. DOI. (HAL)
Nguyen, V. Q., L. Girin, G. Bailly, F. Elisei & D.-C. Nguyen (2018) Autonomous sensorimotor learning for sound source localization by a humanoid robot, Workshop on Crossmodal Learning for Intelligent Robotics in conjunction with IEEE/RSJ IROS 2018, Madrid, Spain. (HAL)
Gerazov, B., G. Bailly, O. Mohammed, Y. Xu & P. Garner (2018) A Variational Prosody Model for the decomposition and synthesis of speech prosody, Conference on Neural Information Processing Systems (NIPS), Montréal, Canada: submitted. (ARXIV)
2017
Mohammed O., G. Bailly & D. Pellier (2017) Acquiring human-robot interaction skills with transfer learning techniques, Human-Robot Interaction (HRI) Pioneer Workshop, Vienna, Austria: pp. 359-360. (HAL)
Godde E., G. Bailly, D. Escudero, M.-L. Bosse, M. Bianco & C. Vilain (2017) Improving fluency of young readers: introducing a Karaoke to learn how to breath during a Reading-while-Listening task, Speech & Language Technology for Education (SLaTE), Stockholm, Sweden: pp. 127-131. DOI. (HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2017) An evaluation framework to assess and correct the multimodal behavior of a humanoid robot in human-robot interaction, Gesture in Interaction (GESPIN), Poznań, Poland: pp. 56-62. (HAL)
Cambuzat R., G. Bailly & F. Elisei (2017) Gaze contingent control of vergence, yaw and pitch of robotic eyes for immersive telepresence, European Conf. on Eye Movements (ECEM), Wuppertal, Germany. (HAL)
Godde E., G. Bailly, D. Escudero, M.-L. Bosse & E. Gillet-Perret (2017) Evaluation of reading performance of primary school children: objective measurements vs. subjective ratings, Workshop on Child Computer Interaction (WOCCI), Glasgow, Scotland: pp. 23-27. DOI. (HAL)
2016
Pouget M., O. Nahorna, T. Hueber & G. Bailly (2016) Adaptive latency for part-of-speech tagging in incremental text-to-speech synthesis. Interspeech, San Francisco, CA: pp. 2846-2850. DOI. (HAL)
Bailly G., F. Elisei, A. Juphard & O. Moreau (2016) Quantitative analysis of backchannels uttered by an interviewer during neuropsychological tests. Interspeech, San Francisco, CA: pp. 2905-2909. DOI. (HAL)
Barbulescu A., R. Ronfard & G. Bailly (2016) Characterization of audiovisual dramatic attitudes. Interspeech, San Francisco, CA: pp. 585-589. DOI. (HAL)
Nguyen, D.-C., G. Bailly & F. Elisei (2016) Conducting neuropsychological tests with a humanoid robot: design and evaluation. IEEE Int. Conf. on Cognitive Infocommunications (CogInfoCom), Wroclaw, Poland, pp. 337-342. (HAL) [BEST PAPER AWARD]
2015
Foerster F., G. Bailly & F. Elisei (2015) Impact of iris size and eyelids coupling on the estimation of the gaze direction of a robotic talking head by human viewers. Humanoids, Seoul, Korea: pp. 148-153. (HAL)
Pouget, M., T. Hueber, G. Bailly and T. Bauman (2015). HMM training strategy for incremental speech synthesis. Interspeech, Dresden, Germany: pp. 1201-1205. (HAL)
Gerbier, E., G. Bailly & M.-L. Bosse (2015). Using Karaoke to enhance reading while listening: impact on word memorization and eye movements. Workshop on Speech and Language Technology for Education (SLaTE), Leipzig, Germany: pp. 59-64. (HAL)
Bailly G., F. Elisei & M. Sauze (2015). Beaming the gaze of a humanoid robot. Human-Robot Interaction (HRI) Late Breaking Reports, Portland, OR: pp. 47-49. (HAL)
Bailly G., A. Mihoub, C. Wolf & F. Elisei (2015). Learning joint multimodal behaviors for face-to-face interaction: performance & properties of statistical models. Human-Robot Interaction (HRI), Workshop on behavior coordination between animals, humans and robots, Portland, OR. (HAL)
Barbulescu, A., G. Bailly, R. Ronfard & M. Pouget (2015) Audiovisual generation of social attitudes from neutral stimuli, Auditory-Visual Speech Processing (AVSP), Vienna, Austria: pp. 34-39. (HAL)
Guillermo G., C. Plasson, F. Elisei, F. Noël & G. Bailly (2015) Qualitative assessment of a beaming environment for collaborative professional activities, European conference for Virtual Reality and Augmented Reality (EuroVR), Milano, Italy. (HAL)
2014
Mihoub, A., G. Bailly & C. Wolf (2014). Modelling perception-action loops: comparing sequential models with frame-based classifiers. Human-Agent Interaction (HAI), Tsukuba, Japan: pp. 309-314. (HAL)
Barbulescu, A., R. Ronfard, G. Bailly, G. Gagneré & H. Cakmak (2014). Beyond basic emotions: expressive virtual actors with social attitudes. ACM/SIGGRAPH Conference on Motion in Games (MIG), Los Angeles, CA: pp. 39-47. (HAL)
Parmiggiani, A., M. Randazzo, M. Maggiali, F. Elisei, G. Bailly & G. Metta (2014). An articulated talking face for the iCub. Humanoids, Madrid: pp. 1-6. (HAL)
Bailly G. & A. Martin (2014). Assessing objective characterizations of phonetic convergence. Interspeech, Singapore: pp. 2011-2015. (HAL)
Rochet-Capellan, A., G. Bailly & S. Fuchs (2014). Is breathing sensitive to the communication partner? Speech Prosody, Dublin: pp. 613-618. (HAL)
2013
Bailly G., A. Rochet-Capellan & C. Vilain (2013). Adaptation of respiratory patterns in collaborative reading. Interspeech, Lyon - France: pp. 1653-1657. (HAL)
Hueber, T., G. Bailly, P. Badin & F. Elisei (2013). Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions. Interspeech, Lyon - France: pp. 2753-2757. (HAL)
Barbulescu, A., T. Hueber, G. Bailly & R. Ronfard (2013). Audiovisual speaker conversion using prosodic features. Auditory-Visual Speech Processing (AVSP), St Jorioz - France: pp. 11-16. (HAL)
Mihoub, A., G. Bailly & C. Wolf (2013). Social behavior modeling based on Incremental Discrete Hidden Markov Models. International Workshop on Human Behavior Understanding (HBU), Barcelona - Spain: pp. 172-183. (HAL)
2012
Bailly G. & C. Gouvernayre (2012). Pauses and respiratory markers of the structure of book reading. Interspeech, Portland, OR: pp. 2218-2221. (HAL)
Hueber, T., A. Ben Youssef, G. Bailly, P. Badin & F. Elisei (2012). Cross-speaker acoustic-to-articulatory inversion using phone-based trajectory HMM for pronunciation training. Interspeech, Portland, OR: pp. 783-786. (HAL)
Hueber, T., G. Bailly & B. Denby (2012). Continuous articulatory-to-acoustic mapping using phone-based trajectory HMM for a silent speech interface. Interspeech, Portland: pp. 723-726. (HAL)
Lelong, A. & G. Bailly (2012). Original objective and subjective characterization of phonetic convergence. International Symposium on Imitation and Convergence in Speech, Aix-en-Provence, France. (HAL)
Lelong, A. & G. Bailly (2012). Characterizing phonetic convergence with speaker recognition techniques. Listening Talker Workshop, Edinburgh, UK: pp. 28-31. (HAL; .pdf)
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly & F. Elisei (2012). Vizart3D : retour articulatoire visuel pour l'aide à la prononciation. Journées d'Etudes sur la Parole (JEP), Grenoble - France: pp. 17-18. (HAL)
2011
Ben Youssef, A., T. Hueber, P. Badin & G. Bailly (2011). Toward a multi-speaker visual articulatory feedback system. Interspeech, Florence: pp. 589-592. (HAL; .pdf)
Bailly G. & W. Barbour (2011). Synchronous reading: learning French orthography by audiovisual training. Interspeech, Florence: pp. 1153-1156. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2011). Improvement of HMM-based acoustic-to-articulatory speech inversion. International Seminar on Speech Production (ISSP), Montréal, CA. (HAL)
Hueber, T., P. Badin, C. Savariaux, C. Vilain & G. Bailly (2011). Differences in articulatory strategies between silent, whispered and normal speech? A pilot study using electromagnetic articulography. International Seminar on Speech Production (ISSP), Montréal, CA. (HAL; .pdf)
Ben Youssef, A., T. Hueber, P. Badin, G. Bailly & F. Elisei (2011). Toward a speaker-independent visual articulatory feedback system. International Seminar on Speech Production (ISSP), Montréal. (HAL)
Hueber, T., P. Badin, G. Bailly, A. Ben Youssef, F. Elisei, B. Denby & G. Chollet (2011). Statistical mapping between articulatory and acoustic features. Application to silent speech interface and visual articulatory feedback. International Workshop on Performative Speech and Singing Synthesis, Vancouver. (HAL; .pdf)
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly & F. Elisei (2011). Articulatory-to-acoustic mapping: application to silent speech interface and visual articulatory feedback. Pan European Voice Conference, Marseille, France. (HAL; .pdf)
2010
Fagel, S. & G. Bailly (2010). On the importance of eye gaze in a face-to-face collaborative task. ACM Workshop on Affective Interaction in Natural Environments (AFFINE), Firenze, Italy: p. 81-85. (HAL; .pdf)
Boucher, J.-D., J. Ventre-Dominey, P. F. Dominey, G. Bailly & S. Fagel (2010). Facilitative effects of communicative gaze and speech in human-robot cooperation. ACM Workshop on Affective Interaction in Natural Environments (AFFINE), Firenze, Italy: p. 71-74. (HAL; .pdf)
Badin, P., A. Ben Youssef, G. Bailly, F. Elisei & T. Hueber (2010). Visual articulatory feedback for phonetic correction in second language learning. Second Language Studies: Acquisition, Learning, Education and Technology, Tokyo. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Acoustic-to-articulatory inversion in speech based on statistical models. Auditory-Visual Speech Processing (AVSP), Hakone, Japan: p. 160-165. (HAL; .pdf)
Bailly G. & A. Lelong (2010). Speech dominoes and phonetic convergence. Interspeech, Tokyo: p. 1153-1156. (HAL; .pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Face-to-tongue articulatory inversion based on statistical models. Interspeech, Tokyo: p. 2002-2005. (HAL; .pdf)
Heracleous, P., P. Badin, G. Bailly & N. Hagita (2010). Robust speech recognition based on multimodal fusion. International Conference on Multimedia & Expo (ICME), Singapore: p. 568-572. (.pdf)
Ben Youssef, A., P. Badin & G. Bailly (2010). Méthodes basées sur les HMMs et les GMMs pour l'inversion acoustico-articulatoire en parole. Journées d'Etudes sur la Parole, Mons, Belgium: p. 249-252. (.pdf)
2009
Ben Youssef, A., P. Badin, G. Bailly & P. Heracleous (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. Interspeech, Brighton: p. 2255-2258. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & T. Toda (2009). Multimodal HMM-based NAM-to-speech conversion. Interspeech, Brighton: p. 656-659. (.pdf)
2008
Bailly G., A. Bégault, F. Elisei & P. Badin (2008). Speaking with smile or disgust: data and models. Auditory-Visual Speech Processing (AVSP), Tangalooma, Australia: p. 111-116. (.pdf)
Bailly G., Y. Fang, F. Elisei & D. Beautemps (2008). Retargeting cued speech hand gestures for different talking heads and speakers. Auditory-Visual Speech Processing (AVSP), Tangalooma, Australia: p. 153-158. (.pdf)
Fagel, S. & G. Bailly (2008). From 3-D speaker cloning to text-to-audiovisual speech. Auditory-Visual Speech Processing (AVSP), Tangalooma - Australia: p. 43-46. (.pdf)
Bailly G., O. Govokhina, G. Breton, F. Elisei & C. Savariaux (2008). The trainable trajectory formation model TD-HMM parameterized for the LIPS 2008 challenge. Interspeech, Brisbane, Australia: p. 2318-2321. (.pdf)
Theobald, B.-J., S. Fagel, G. Bailly & F. Elisei (2008). LIPS2008: Visual speech synthesis challenge. Interspeech, Brisbane, Australia: p. 2310-2313. (.pdf)
Fagel, S., F. Elisei & G. Bailly (2008). From 3-D speaker cloning to text-to-audiovisual-speech. Interspeech, Brisbane, Australia: p. 2325. (.pdf)
Badin, P., Y. Tarabalka, F. Elisei & G. Bailly (2008). Can you “read tongue movements”? Interspeech, Brisbane, Australia: p. 2635-2637. (HAL; .pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & C. Jutten (2008). Improvement to a NAM captured whisper-to-speech system. Interspeech, Brisbane, Australia: p. 1465-1468. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & C. Jutten (2008). Amélioration de système de conversion de voix inaudible vers la voix audible. Journées d'Etudes sur la Parole (JEP), Avignon, France. (.pdf)
Tran, V.-A., G. Bailly, H. Loevenbruck & T. Toda (2008). Predicting F0 and voicing from NAM-captured whispered speech. Speech Prosody, Campinas, Brazil: p. 107-110. (.pdf)
Bailly G. & A. Bartroli (2008). Generating Spanish intonation with a trainable prosodic model. Speech Prosody, Campinas - Brazil: p. 63-66. (.pdf)
Bérar, M., M. Desvignes & G. Bailly (2008). Reconstruction faciale 3D à partir d’images 3D. RFIA, Amiens, France.
2007
Raidt, S., G. Bailly & F. Elisei (2007). Gaze patterns during face-to-face interaction. IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshop on Communication between Human and Artificial Agents (CHAA), Fremont, CA: p. 338-341. (.pdf)
Elisei, F., G. Bailly & A. Casari (2007). Towards eyegaze-aware analysis and synthesis of audiovisual speech. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands: p. 120-125. (HAL; .pdf)
Fagel, S., G. Bailly & F. Elisei (2007). Intelligibility of natural and 3D-cloned German speech. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands: p. 126-131. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2007). Mutual gaze during face-to-face interaction. Auditory-visual Speech Processing, Hilvarenbeek, The Netherlands. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2007). Analyzing and modeling gaze during face-to-face interaction. 7th International Conference on Intelligent Virtual Agents, Paris: p. 403-404. (.pdf)
Raidt S., G. Bailly & F. Elisei (2007). Impact of cognitive state on gaze patterns during face-to-face interaction. 14th European Conference on Eye Movements, Potsdam, Germany.
Picot, A., G. Bailly, F. Elisei & S. Raidt (2007). Scrutinizing natural scenes: controlling the gaze of an embodied conversational agent. 7th International Conference on Intelligent Virtual Agents, Paris: p. 272-282. (.pdf)
Govokhina, O., G. Bailly & G. Breton (2007). Learning optimal audiovisual phasing for a HMM-based control model for facial animation. ISCA Speech Synthesis Workshop, Bonn, Germany. (.pdf)
Tarabalka, Y., P. Badin, F. Elisei & G. Bailly (2007). Peut-on lire sur la langue? Évaluation de l'apport de la vision de la langue à la compréhension de la parole. Journées de Phonétique Clinique, Grenoble, France.
Tarabalka, Y., P. Badin, F. Elisei & G. Bailly (2007). Can you “read tongue movements”? Evaluation of the contribution of tongue display to speech understanding. Conférence Internationale sur l'Accessibilité et les systèmes de suppléance aux personnes en situation de Handicaps (ASSISTH), Toulouse - France. (HAL; .pdf)
Beautemps, D., L. Girin, N. Aboutabit, G. Bailly, L. Besacier, G. Breton, T. Burger, A. Caplier, M.-A. Cathiard, D. Chêne, J. Clarke, F. Elisei, O. Govokhina, M. Marthouret, S. Mancini, Y. Mathieu, P. Perret, B. Rivet, P. Sacher, C. Savariaux, S. Schmerber, J.-F. Sérignat, M. Tribout & S. Vidal (2007). TELMA : Téléphonie à l'usage des malentendants. Des modèles aux tests d'usage. Conférence Internationale sur l'Accessibilité et les systèmes de suppléance aux personnes en situation de Handicaps (ASSISTH), Toulouse - France. (HAL; .pdf)
2006
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). A new trainable trajectory formation system for facial animation. ISCA Workshop on Experimental Linguistics, Athens, Greece: 25-32. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluation of a virtual speech cuer. ISCA Workshop on Experimental Linguistics, Athens, Greece: 141-144. (.pdf)
Bailly G., F. Elisei, S. Raidt, A. Casari & A. Picot (2006). Embodied conversational agents: computing and rendering realistic gaze patterns. LNCS 4261: Pacific Rim Conference on Multimedia Processing, Hangzhou, China: 9-18. (HAL; .pdf)
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). TDA: A new trainable trajectory formation system for facial animation. InterSpeech, Pittsburgh, PA: 2474-2477. (.pdf)
Bailly G. & I. Gorisch (2006). Generating German intonation with a trainable prosodic model. InterSpeech, Pittsburgh, PA: 2366-2369. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2006). Audiovisual speech enhancement experiments for mouth segmentation evaluation. EUSIPCO, Pisa, Italy. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluating a virtual speech cuer. InterSpeech, Pittsburgh, PA: 2430-2433. (.pdf)
Bailly G., C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin, D. Grison, J.-P. Léoni, J. Liénard, N. Moreau & P. Nguyen (2006). ARTUS : calcul et tatouage audiovisuel des mouvements d'un personnage animé virtuel pour l'accessibilité d'émissions télévisuelles aux téléspectateurs sourds comprenant la Langue Française Parlée Complétée. Handicap, Paris: 265-270. (.pdf)
Bailly G., F. Elisei, P. Badin & C. Savariaux (2006). Degrees of freedom of facial movements in face-to-face conversational speech. International Workshop on Multimodal Corpora, Genoa - Italy: 33-36. (.pdf)
Boula de Mareüil, P., C. d'Alessandro, A. Raake, G. Bailly, M.-N. Garcia & M. Morel (2006). A joint intelligibility evaluation of French text-to-speech systems: the EvaSy SUS/ACR campaign. Language Resources and Evaluation Conference (LREC), Genova - Italy: 2034-2037. (.pdf)
Garcia, M.-N., C. d’Alessandro, P. Boula de Mareüil, A. Raake, G. Bailly & M. Morel (2006). A joint intelligibility evaluation of French text to speech systems: the EVASY/SUS campaign. Language Resources and Evaluation Conference (LREC), Genova - Italy: 307-310. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2006). Does a virtual talking face generate proper multimodal cues to draw user's attention towards interest points? Language Resources and Evaluation Conference (LREC), Genova - Italy: 2544-2549. (.pdf)
Govokhina, O., G. Bailly, G. Breton & P. Bagshaw (2006). Evaluation de systèmes de génération de mouvements faciaux. Journées d'Etudes sur la Parole, Rennes - France: 305-308. (.pdf)
Gibert, G., G. Bailly & F. Elisei (2006). Evaluation d'un système de synthèse 3D de Langue française Parlée Complétée. Journées d'Etudes sur la Parole, Rennes - France: 495-498. (.pdf)
2005
Raidt, S., F. Elisei & G. Bailly (2005). Face-to-face interaction with a conversational agent: eye-gaze and deixis. in International Conference on Autonomous Agents and Multiagent Systems, Utrecht University, The Netherlands. (.pdf)
Raidt, S., F. Elisei & G. Bailly (2005). Basic components of a face-to-face interaction with a conversational agent: mutual attention and deixis. in Smart Objects and Ambient Intelligence, Grenoble - France: 247-252. (.pdf)
Raidt, S., G. Bailly & F. Elisei (2005). Eye gaze in face-to-face interaction with a talking head. European Conference on Eye Movements, Bern, Switzerland.
Bailly G., F. Elisei & S. Raidt (2005). Multimodal face-to-face interaction with a talking face: mutual attention and deixis. in Human-Computer Interaction International, Las Vegas. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Statistical skull models from 3D X-ray images. in International Conference on Reconstruction of Soft Facial Parts, Remagen, Germany. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). 3D statistical facial reconstruction. in International Symposium on Image and Signal Processing and Analysis, Zagreb, Croatia. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Missing data estimation using polynomial kernels. in International Conference on Advances in Pattern Recognition, Bath, UK. (.pdf)
Bérar, M., M. Desvignes, G. Bailly & Y. Payan (2005). Reconstruction par noyaux polynomiaux. in GRETSI, Louvain-la-Neuve, Belgium. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Statistical active model for mouth components segmentation. in International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA: 1021-1024. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Modèle statistique et description locale d'apparence pour la détection des contours des lèvres. GRETSI, Louvain-la-Neuve, Belgium. (.pdf)
Gacon, P., P. Y. Coulon & G. Bailly (2005). Non-linear active model for mouth inner and outer contours detection. in EUSIPCO, Antalya, Turkey. (.pdf)
Elisei, F., G. Bailly, G. Gibert & R. Brun (2005). Capturing data and realistic 3D models for cued speech analysis and audiovisual synthesis. in Auditory-Visual Speech Processing Workshop, Vancouver, Canada. (.pdf)
Boula de Mareüil, P., C. d'Alessandro, G. Bailly, F. Béchet, M.-N. Garcia, M. Morel, R. Prudon & J. Véronis (2005). Evaluation de la prononciation des noms propres par quatre convertisseurs graphème-phonème en français. Colloque Traitement lexicographique des noms propres, Tours - France.
Boula de Mareüil, P., C. d'Alessandro, G. Bailly, F. Béchet, M.-N. Garcia, M. Morel, R. Prudon & J. Véronis (2005). Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters. InterSpeech, Lisbon, Portugal: 1521-1524. (.pdf)
2004
Raidt,
S., G. Bailly,
B. Holm & H. Mixdorff (2004) Automatic generation
of prosody: comparing two superpositional systems. in International
Conference on Speech Prosody. Nara, Japan: 417-420. (.pdf)
Berar,
M., M. Desvignes, G.
Bailly & Y. Payan (2004) 3D Meshes
Registration: Application to statistical skull model. in International
Conference on Image Analysis and Recognition. Porto -
Portugal: 100-107. (.pdf)
Odisio,
M. & G. Bailly
(2004) Audiovisual perceptual evaluation of
resynthesised speech movements. in International Conference
on Speech and Language Processing. Jeju, Korea:
2029-2032. (.pdf)
Bailly G., Holm, B. &
Aubergé, V.
(2004) A trainable prosodic model: learning the contours implementing
communicative functions within a superpositional model of
intonation. International Conference on Speech and Language
Processing.
Jeju, Korea.
p. 1425-1428. (.pdf)
Chen, G.-P., G.
Bailly, Q.-F. Liu & R.-H.
Wang (2004)
A superposed prosodic model for Chinese text-to-speech synthesis. in International
Conference of Chinese Spoken Language Processing.
p.177-180. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004). Audiovisual
text-to-cued speech synthesis. Eusipco, Vienna -
Austria: 1007-1010. (.pdf)
Gibert, G., G.
Bailly & F. Elisei
(2004). Audiovisual text-to-cued speech synthesis. 5th Speech
Synthesis Workshop, Pittsburgh, PA:
85-90. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004).
Evaluation of a speech cuer: from motion capture to a concatenative
text-to-cued speech system. Language Resources and
Evaluation Conference (LREC),
Lisbon, Portugal: 2123-2126. (.pdf)
Gibert, G., G.
Bailly, F. Elisei, D. Beautemps & R.
Brun
(2004).
Mise en oeuvre d'un synthétiseur 3D de Langage
Parlé
Complété. Journées d'Etudes
sur la Parole,
Fès, Maroc: 245-248. (.pdf)
Gacon, P.,
P.-Y. Coulon & G.
Bailly (2004).
Shape and sampled-based appearance model for mouth components
segmentation. International Workshop on Image Analysis for
Multimedia
Interactive Services, Lisbon. (.pdf)
2003
Odisio, M. & G.
Bailly (2003).
Shape and appearance models of talking faces for model-based tracking. Audio
Visual Speech Processing, St Jorioz,
France: 105-110. (.pdf)
Odisio, M. & G.
Bailly (2003).
Shape and appearance models of talking faces for model-based tracking. International
Conference on Computer Vision,
Nice - France: 143-148.
(.pdf)
Bérar,
M., G. Bailly,
M. Chabanas, F. Elisei, M. Odisio & Y. Payan (2003).
Towards a generic talking head. 6th International Seminar on
Speech Production, Sydney - Australia: 7-12. (.pdf)
Bailly G.,
N.
Campbell & B. Mobius (2003).
ISCA Special Session: Hot Topics in Speech Synthesis. EuroSpeech,
Geneva, Switzerland: 37-40. (DOI)
Bailly G.,
F. Elisei, M. Odisio, D. Pelé & K. Grein-Cochard
(2003).
Objects and agents for MPEG-4 compliant scalable face-to-face
telecommunication. Smart Object Conference,
Grenoble - France: 204-207. (.pdf)
Badin, P., G.
Bailly, F. Elisei & M.
Odisio (2003). Virtual talking heads and audiovisual articulatory
synthesis. International Congress on Phonetic Sciences,
Barcelona: 193-197. (.pdf)
2002
Holm,
B. & G. Bailly
(2002). Learning the hidden structure of intonation:
implementing various functions of prosody. Speech Prosody,
Aix-en-Provence, France:
399-402. (.pdf)
Bailly G.,
G. Gibert
& M. Odisio (2002). Evaluation of
movement generation
systems using the point-light technique. IEEE Workshop on
Speech Synthesis, Santa Monica, CA: 27-30. (DOI)
Bailly G. (2002).
Audiovisual speech synthesis. From ground truth
to models. International
Conference on Speech and Language Processing, Denver -
Colorado: 1453-1456. (DOI)
Bailly G. &
P.
Badin (2002). Seeing
tongue movements from outside. International Conference on
Speech
and Language Processing, Denver - Colorado: 1913-1916. (.pdf)
Bailly G. &
B. Holm
(2002). Learning the hidden structure
of speech: from
communicative functions to prosody. Symposium on Prosody and
Speech Processing, Tokyo, Japan: 113-118. (DOI)
2001
Elisei,
F., M. Odisio, G. Bailly
& P. Badin (2001). Creating and controlling
video-realistic talking heads. Auditory-Visual Speech
Processing Workshop, Scheelsminde, Denmark: 90-97. (.ps)
Bailly G. (2001).
Audiovisual speech synthesis. ETRW on
Speech
Synthesis, Perthshire - Scotland: 1-10.
Bailly G. (2001).
Close shadowing natural vs synthetic speech. ETRW
on Speech Synthesis, Perthshire - Scotland: 87-90. (DOI, .pdf)
2000
Revéret,
L., G. Bailly
& P. Badin (2000). MOTHER: a new generation of talking
heads providing a flexible articulatory control for video-realistic
speech animation. International Conference on Speech and
Language Processing, Beijing - China: 755-758. (DOI)
Revéret,
L., G. Bailly,
P. Borel & P. Badin (2000). Analyse par
la synthèse d'un visage 3D parlant : inversion
opto-articulaire. Journées d'Etudes sur la Parole,
Aussois, France: 125-128.
Holm,
B. & G. Bailly
(2000). Generating prosody by superposing
multi-parametric overlapping contours. Proceedings of the
International Conference on Speech and Language Processing,
Beijing, China: 203-206. (DOI)
Holm, B. & G.
Bailly (2000).
Génération
de la prosodie par superposition de contours chevauchants: application
à l'énonciation de formules
mathématiques. Journées d'Etudes sur la
Parole, Aussois - France: 113-116. (.ps)
Borel, P., P. Badin, L.
Revéret & G.
Bailly (2000). Modélisation
articulatoire linéaire 3D d'un visage pour une
tête parlante virtuelle. Journées
d'Etudes sur la Parole,
Aussois, France:
121-124. (.ps)
Bailly G. (2000).
Evaluation des systèmes
d'analyse-modification-synthèse de parole. Journées
d'Etudes sur la Parole, Aussois - France: 109-112. (.ps)
Bailly G.,
E. R. Banga, A. Monaghan & E. Rank (2000). The
Cost258 Signal
Generation Test Array. Second International Conference on
Language Resources and Evaluation,
Athens - Greece: 651-654. (.pdf)
Badin P., P. Borel, G.
Bailly, L. Revéret, M.
Baciu & C.
Segebarth (2000). Towards an audiovisual virtual talking head: 3D
articulatory modeling of tongue, lips and face based on MRI and video
images. Proceedings of the 5th Speech Production Seminar,
Kloster Seeon - Germany: 261-264. (.ps)
1999
Morlec Y., G.
Bailly & V. Aubergé (1999).
Training an application-dependent prosodic model: corpus and evaluation. Proceedings
of the European Conference on Speech Communication and Technology,
Budapest, Hungary: 1643-1646. (DOI)
Holm B., G.
Bailly & C. Laborde (1999)
Performance structures of mathematical
formulae. International Congress of Phonetic
Sciences, San Francisco, USA: 1297-1300. (.pdf)
Badin, P., G. Bailly & L.-J. Boé (1999).
Speech production models and virtual talking heads: useful aids
for pronunciation training, InStill,
Besançon, France. (.ps)
Bailly G. (1999)
Accurate estimation of sinusoidal
parameters in an harmonic+noise
model for speech synthesis, Proceedings
of the European Conference on Speech
Communication and Technology, Budapest, Hungary: 1051-1054. (DOI)
1998
Neagu, A. & G.
Bailly (1998)
Collaboration vs. competition between burst and transition cues for the
perception and identification of
French stops, Proceedings
of the International Conference on Speech and Language Processing,
Sydney, Australia: 2127-2130. (DOI)
Morlec, Y., A. Rilliard, G. Bailly &
V. Aubergé
(1998) Evaluating the adequacy
of synthetic prosody in
signaling syntactic boundaries:
methodology and first results, First
International Conference on Language Resources
and Evaluation, Granada, Spain. (.ps)
Bailly G.,
P. Badin & A. Vilain (1998) Contribution
de
la mâchoire à la géométrie
de la langue dans les modèles articulatoire statistiques. Journées
d'Etudes sur la Parole, Martigny, Suisse: 287-290.
(.ps)
Bailly G.,
P. Badin & A. Vilain (1998) Synergy
between jaw and lips/tongue
movements: consequences in
articulatory modelling, International Conference on Speech
and Language Processing,
Sydney, Australia: 417-420. (.pdf)
Badin, P., L. Pouchoy, G.
Bailly, M. Raybaudi, C. Segebarth, J.-F. Lebas, M. Tiede,
E. Vatikiotis-bateson & Y. Tohkura (1998) Un
modèle
articulatoire tridimensionnel du conduit vocal basé sur des
données IRM. Journées d'Etudes sur la
Parole, Martigny, Suisse: 283-286.
Badin, P., G.
Bailly, M. Raybaudi & C. Segebarth (1998) A
three-dimensional linear
articulatory model based on MRI data, International
Conference on Speech and Language Processing, Sydney,
Australia: 417-420. (.pdf)
Badin, P., G.
Bailly, M. Raybaudi & C. Segebarth (1998) A
three-dimensional linear
articulatory model based on MRI data, ESCA/COCOSDA
Workshop on Speech Synthesis, Jenolan Caves, Australia:
249-254.
Badin P., G.
Bailly & L.-J. Boé
(1998) Towards the use of a virtual talking head and of speech mapping tools
for pronunciation training, ESCA Tutorial
and
Research Workshop on Speech Technology in Language Learning,
Stockholm
– Sweden
1997
Neagu, A. & G.
Bailly
(1997) Relative contributions of noise burst and vocalic transitions
to the perceptual identification
of stop consonants, European Conference on Speech
Communication and Technology,
Rhodes - Greece: 2175-2178. (.pdf)
Morlec, Y., G.
Bailly & V. Aubergé (1997)
Generating the prosody of
attitudes, ETRW Workshop on Prosody,
Athens -
Greece: 251-254. (.pdf)
Morlec, Y., G.
Bailly & V. Aubergé (1997) Synthesising
attitudes with global rhythmic
and intonation contours, European Conference on Speech
Communication and Technology,
Rhodes - Greece: 219-222. (.pdf)
Mawass, K., P. Badin & G. Bailly (1997)
Synthesis
of fricative consonants by
audiovisual-to-articulatory inversion, European Conference on
Speech Communication and Technology,
Rhodes - Greece: 1359-1362. (DOI)
1996
Neagu, A. & G.
Bailly (1996) R1,
R2 et R3 : un
ensemble
robuste de paramètres pour la caractérisation des
espaces vocaliques. Journées d'Etudes sur la Parole,
Avignon-France: 247-250.
Morlec Y., G.
Bailly & V. Aubergé (1996) Un
modèle
connexionniste modulaire pour l'apprentissage des gestes intonatifs. Journées
d'Etudes sur la Parole, Avignon-France: 207-210. (.ps)
Morlec Y., G.
Bailly & V. Aubergé (1996) Generating
intonation by superposing
gestures. International
Conference on Speech and Language Processing, Philadelphia -
USA: 283-286. (.pdf)
Beautemps D., P. Badin, G.
Bailly, A. Galvàn & R.
Laboissière (1996) Evaluation
of an articulatory-acoustic model
based on a reference subject. ETRW
on Speech
Production: from
Control Strategies to acoustics, Autrans - France: 45-48.
Bailly G. (1996)
Emergence
de prototypes sensori-moteurs à partir d'exemplaires
audio-visuels. Journées d'Etudes sur la Parole,
Avignon-France: 87-90.
Bailly G. (1996). Building sensori-motor prototypes from audio-visual exemplars. International Conference on Speech and Language Processing (ICSLP), Philadelphia - USA: 957-960.
Badin, P., K. Mawass, G.
Bailly, C. Vescovi, D. Beautemps & X. Pelorson
(1996) Articulatory synthesis of fricative
consonants: data and models, ETRW
on Speech Production: from Control Strategies to
acoustics, Autrans - France: 221-224.
Bailly G. (1996)
Sensori-motor control of speech
movements. ETRW on
Speech Production Modelling: from Control Strategies to acoustics,
Autrans: 145-154.
1995
Morlec, Y., V. Aubergé
& G. Bailly
(1995)
Evaluation of automatic generation of prosody with a superposition
model, International Congress of Phonetic
Sciences, Stockholm - Sweden: 224-227.
(.ps)
Bailly G.,
L.-J. Boé, N. Vallée & P.
Badin (1995) Articulatory-acoustic
prototypes for speech production, European Conference on
Speech Communication and Technology,
Madrid - Spain: 1913-1916.
Bailly G. (1995)
Recovering place of articulation
for occlusives in VCVs, International
Congress of Phonetic Sciences, Stockholm - Sweden: 230-233.
Badin P., B. Gabioud, D. Beautemps, T.
Lallouache, G. Bailly,
S. Maeda,
J.-P. Zerling & G. Brock
(1995) Cineradiography of VCV sequences: articulatory-acoustic data
for a speech production model, International Congress on Acoustics,
Trondheim - Norway: 349-352. (.pdf)
Aubergé V. & G. Bailly (1995)
Generation of intonation: a global approach, European Conference on Speech
Communication and Technology,
Madrid: 2065-2068. (.pdf)
1994
Bonnyman, J. M., M. Curtis & G. Bailly (1994) A
neural network application for
the analysis and synthesis
of multilingual speech, IEEE
International Conference on Speech,
Image Processing and Neural Networks, Hong Kong: 327-330.
Bailly G.,
E. Castelli & B. Gabioud (1994)
Building prototypes for articulatory
speech synthesis, International Workshop on Speech
Synthesis, New Paltz - New
York: 9-12.
Barbosa, P. & G.
Bailly (1994) Generating pauses
within the z-score model, ETRW
on Speech Synthesis, New Paltz - New York: 101-104.
1993
Bailly G. (1993)
Resonances as possible representation
of speech in the
auditory-to-articulatory transform. European Conference on
Speech Communication and Technology,
Berlin, 1511-1514. (.pdf)
Alissali, M. & G.
Bailly (1993) Compost: a
client-server model for applications
using text-to-speech. European Conference on Speech
Communication and Technology,
Berlin: 2095-2098.
1992
Barbosa, P. & G.
Bailly (1992)
Génération
automatique des P-centers. Journées d'Etudes sur
la Parole, Bruxelles, Belgique: 357-361.
Barbosa, P. & G.
Bailly (1992) Generating
segmental
duration by P-centers, Fourth Rhythm Workshop:
Rhythm
Perception and Production, Bourges, France, Ville de
Bourges: 163-168
1991
Guerti, M. & G.
Bailly (1991) Synthesis-by-rule using COMPOST: modelling resonance
trajectories. European Conference on Speech
Communication and Technology, Genova - Italy:
43-46.
Bailly G. &
M. Guerti (1991). Synthesis-by-rule for
French. International
Congress of
Phonetic Sciences, Aix-en-Provence, France:
506-509.
Alissali,
M. & G. Bailly
(1991). COMPOST: un serveur de synthèse
multilingue. 8e Congrès sur la Reconnaissance de
Formes et l'Intelligence Artificielle, Lyon-Villeurbanne:
183-192.
1990
Yé,
H., S. Wang, G. Bailly
& F. Robert (1990). Exploration of temporal
processing of a sequential network for speech parameter estimation. Applications
of Artificial Neural Networks, Orlando,
Florida: 16-20.
Wang, H., G.
Bailly &
D. Tuffelli (1990). Automatic segmentation and alignment of continuous
speech based on the temporal decomposition model. International
Conference on Speech and Language Processing,
Kobe, Japan: 457-460.
Barbe, T. & G.
Bailly (1990). Evaluation
d'un détecteur de fréquence
fondamentale du signal microphonique par comparaison à une
référence
laryngographique. Journées d'Etudes sur la Parole,
Montréal:
165-169.
Bailly G. &
M. Guerti (1990). Anticipation
et rétention dans les mouvements vocaliques en
Français. Journées d'Etudes sur la
Parole, Montréal - Canada: 292-295.
Bailly G.,
T. Barbe & H. Wang (1990). Automatic labelling of large
prosodic
databases: tools, methodology and links with a text-to-speech system. ETRW
Workshop on Speech Synthesis, Autrans - France: 201-204.
Bailly G. (1990).
Robotics in speech production: Motor
control theory. ETRW
Tutorial Day on
Speech Synthesis, Autrans - France, 17-26.
1989 and before
Bailly G. &
A. Tran
(1989). COMPOST: a rule-compiler for
speech synthesis. European
Conference on Speech Communication and Technology: 136-139.
Bailly G.,
P.-F. Marteau & C. Abry (1989). A new algorithm for temporal
decomposition of speech. Application to a numerical model of
coarticulation. IEEE International Conference on Acoustics,
Speech, and Signal Processing, Glasgow, Scotland: 508-511.
Marteau P.-F., G.
Bailly & M.-T. Janot-Giorgetti (1988).
Stochastic model of
diphone-like segments based on trajectory concepts. IEEE
International Conference on Acoustics, Speech, and Signal Processing,
New York - USA: 615-618.
Bailly
G., G. Murillo, O.
A. Dakkak & B.
Guérin (1988). A
text-to-speech synthesis system for French by formant synthesis. 7th
FASE Symposium: 225-260.
Dakkak, O. A., G. Murillo, G. Bailly & B. Guérin (1987). Using Contextual Information in View of Formant Analysis Improvement. Recent Advances in Speech Understanding and Dialog Systems, Bad Windsheim - Germany, NATO ASI Series.
Dakkak O. A., G. Murillo & G. Bailly (1987).
Automatic
extraction of formant
parameters using a-priori knowledge. IASTED, Applied Control
Filtering and Signal Processing, Geneva –
Switzerland
Bailly G. & J. Liu (1987).
Détection
d'indices par
quantification
vectorielle et réseaux Markoviens. Journées
d'Etudes sur la Parole,
Hammamet - Tunisie, GALF: 60-63.
Bailly G. (1986).
Détection
du fondamental par AMDF et programmation dynamique. Journées
d'Etudes sur la Parole, Aix-en-Provence - France, GALF:
285-288.
Bailly G. (1986).
Un modèle de congruence relationnel pour la
synthèse
de la prosodie du français. Journées
d'Etudes sur la Parole, Aix-en-Provence - France, GALF:
75-78.
International conferences without review process
Bosse M.-L., G.
Bailly and E. Gerbier (2015). Acquisition
de l’orthographe d’un mot nouveau pendant la
lecture d’un texte : une
étude oculométrique en lecture
synchrone. Symposium
International sur la Littératie
à l’Ecole (SILE), Sherbrooke, Canada.
Gerbier, E., G.
Bailly
and M.-L.
Bosse (2015). The
effect of audiovisual synchronization in
reading while listening to texts: An eye-tracking study. Conference of
the European Society for Cognitive Psychology (ESCOP),
Paphos, Cyprus.
Hueber, T., A. Ben Youssef, P. Badin, G. Bailly
& F. Elisei (2011). Articulatory-to-acoustic mapping:
application to silent speech
interface and visual articulatory feedback. Pan-European Voice
Conference (PEVOC).
Marseille.
Badin, P., F. Elisei, L. Huang, Y.
Tarabalka & G.
Bailly (2008).
Vision of tongue in augmented speech: contribution to speech
comprehension and visual tracking strategies. In Speech and
Face-to-Face
communication - A workshop / Summer School dedicated to
the Memory of Christian Benoît: pp. 97. Grenoble, France,
27-29 October 2008.
Bailly G.,
O. Govokhina & G. Breton (2008). Multimodal
control of talking heads. Acoustics. Paris.
Laboissière R., J.-L.
Schwartz & G.
Bailly (1991). Motor control for speech
skills: a connectionist approach. Proceedings
of the 1990 Connectionist Models Summer
School. D. S. Touretzky, J. L. Elman, T. J. Sejnowski and G.
E. Hinton. San Mateo, CA, Morgan
Kaufmann: 319-327
Wang H., G.
Bailly & D. Tuffelli (1990). Automatic
segmentation and alignment of
continuous speech based on
the temporal decomposition model, Journal
of the Acoustical Society of America: S106.
Bailly G.,
M.
Jordan, M. Mantakas, J.-L. Schwartz, M. Bach & M. Olesen
(1990).
Simulation of vocalic gestures
using an articulatory model
driven by a sequential
neural network, Journal of the
Acoustical Society of America:
S105.
National conferences
Lenglet M., O. Perrotin & G. Bailly (2023) A Closer Look at Latent Representations of End-to-end TTS Models, Journée commune AFIA-TLH / AFCP – “Extraction de connaissances interprétables pour l’étude de la communication parlée”, Avignon, France. (HAL)
Briglia, A., E. Godde, C. Boggio, M.-L. Bosse & G. Bailly (2023) ELARGIR : s’entraîner à lire avec fluence et expressivité. Rencontres des Jeunes Chercheurs en Parole (RJCP), Grenoble, France. (HAL)
Koelsch, L., G. Bailly, F. Elisei, P. Huguet & L. Ferrand (2021) L’impact des robots sur notre cognition: l’effet de présence robotique, Workshop sur
les Affects, Compagnons Artificiels et Interactions (WACAI), Oléron, France. (HAL)
Godde, E., G. Bailly & M.-L. Bosse (2019). Un
Karaoké pour Entraîner la Prosodie en
Lecture, Conférence
sur les Environnements Informatiques pour l’Apprentissage
Humain (EIAH), Paris,
France: pp. 363-366. (HAL)
Nguyen D.C., F. Elisei & G. Bailly (2016).
Demonstrating to a humanoid robot how to conduct neuropsychological
tests.
Journées Nationales de Robotique Humanoïde
(JNRH), Toulouse, France: pp.10-12. (HAL)
Gerbier, E., G. Bailly and M.-L. Bosse (2015). Effet de la synchronie audio-visuelle en lecture sur les mouvements oculaires. Congrès National de la Société Française de Psychologie, Strasbourg, France.
Bailly G.,
M.
Chetouani, M. Ochs, A. Pauchet and H. Fiorino (2014). Virtual
conversational agents and social robots: converging challenges. Workshop
Affect, Compagnon Artificiel, Interaction, Rouen, France.
Mihoub, A., G.
Bailly and
C. Wolf (2014). Modeling sensory-motor behaviors for social robots. Workshop
Affect, Compagnon Artificiel, Interaction, Rouen, France.
Fagel, S. and G.
Bailly (2010). Speech, gaze and head motion in a
face-to-face collaborative task. Electronic Speech Signal
Processing
(ESSV). Berlin. (HAL;
.pdf)
Ben Youssef, A., V.-A. Tran, P. Badin and G. Bailly
(2009). HMMs and GMMs based methods in acoustic-to-articulatory speech
inversion. 8èmes
Rencontres Jeunes Chercheurs en Parole,
Avignon
- France, A182.
Badin, P., G.
Bailly, F. Elisei, L. Lamalle, C. Savariaux & A.
Serrurier (2007). Têtes parlantes audiovisuelles virtuelles:
données, modèles et applications. Congrès de la
Société Française de Phoniatrie,
Paris.
Casari A., F. Elisei, G.
Bailly & S. Raidt (2006). Contrôle du
regard et des
mouvements
des paupières d'une tête parlante virtuelle. Workshop sur
les Agents Conversationnels Animés (WACA), Toulouse -
France. (.pdf)
Raidt, S., G.
Bailly & F. Elisei (2006). Plateforme
expérimentale de
capture-restitution croisée pour l'étude de la
communication face-à-face. Workshop sur les Agents
Conversationnels Animés, Toulouse - France. (.pdf)
Picot, A., G.
Bailly, F. Elisei & S. Raidt (2006). Scrutation de
scènes naturelles par un
agent conversationnel animé. Workshop sur les Agents
Conversationnels Animés, Toulouse - France. (.pdf)
Alami,
R., G. Bailly,
L. Brèthes, R. Chatila, A. Clodic, J. Crowley, P.
Danès,
F. Elisei, S. Fleury, M. Herrb, F. Lerasle, & P. Menezes (2005)
HR+: Towards an interactive
autonomous robot. in Journées du programme
interdisciplinaire ROBEA. Montpellier - France. (.pdf)
Alami,
R., G. Bailly
& J. Crowley (2004) HR+:
Pour une
interaction
homme-robot autonome. Journées du programme
interdisciplinaire
ROBEA, Toulouse – France. (.pdf)
Bas, P.,
J. Lienard,
J.-M.
Chassery, D. Beautemps & G.
Bailly (2003). Artus: animation
réaliste par tatouage audiovisuel à l'usage des
sourds. Journée Nationale sur « Image et
Signal pour le Handicap », Paris. (.pdf)
Niswar, A., G.
Bailly & K. Kroschel (2003) Construction of an
individualized
visual speech synthesizer from
orthogonal 2D images. Institute
for Automation and Robotics, Duisburg – Germany. (.pdf)
Alami R., G.
Bailly & J.-L. Crowley (2002) HR+: Pour une
interaction homme-robot
autonome. Journées du programme interdisciplinaire
ROBEA,
Toulouse - France: 39-40.
Elisei F., G.
Bailly, M. Odisio & P. Badin (2001) Clones
parlants
vidéo-réalistes: application à
l'interprétation de FAP-MPEG4. Colloque sur la
Compression
et Représentation des Signaux Audiovisuels (CORESA),
Dijon - France: 145-148. (.ps)
Odisio M., F. Elisei, G.
Bailly & P. Badin (2001) Clones
parlants
vidéo-réalistes: application à
l'analyse de messages audiovisuels. Colloque sur la
Compression
et Représentation des Signaux Audiovisuels (CORESA),
Dijon -
France: 141-144. (.ps)
Bailly G.,
F. Elisei & P. Badin (2001)
Télécommunications
virtuelles et clones parlants. Premières
Rencontres des Sciences et Technologies de l'Information (ASTI),
La Villette -
France: 61.
Bailly G.,
L. Revéret, P. Borel & P. Badin
(2000) Hearing by eyes thanks to the labiophone: exchanging
speech movements, COST254
workshop: Friendly Exchanging Through The Net, Bordeaux
– France. (.ps)
Rilliard A., V. Aubergé, G. Bailly &
Y.
Morlec (1997) Vers
une mesure de
l'information
linguistique véhiculée par la prosodie. 1ères
JST FRANCIL, Avignon-France: 481-487.
Morlec Y., G.
Bailly & V. Aubergé (1997)
Apprentissage
automatique d'un
module de génération multistyle de l'intonation. 1ères
JST FRANCIL, Avignon-France: 407-412.
d'Alessandro,
C., V.
Aubergé, G.
Bailly, F. Béchet, P. Boula de
Mareuil, S. Foukia, J.-P. Goldman, J. F. Isabelle, E. Keller, A.
Marchal, P. Mertens, V. Pagel, D. O'Shaughnessy, G. Richard, M. H.
Talon, E. Wehrli & F. Yvon (1997). Vers l'évaluation
de
systèmes de synthèse de parole à
partir du texte en français. 1ères JST
Francil de l'AUPEL-UREF, Avignon - France: 393-398.
Bailly G. (1995).
Pistes de
recherches en synthèse de la parole. École
Thématique: fondements et perspectives en traitement
automatique de la parole. H. Méloni. Marseille -
Luminy - France, Université d'Avignon et des Pays de
Vaucluse:
211-220.
Laboissière, R., J.-L.
Schwartz & G.
Bailly (1992)
Modélisation du contrôle moteur
en production de la parole
vers un robot parlant, Cinquième Colloque de l'ARC, Nancy, France: 115-124.
Bailly G.,
M. Bach, M. Olesen, J.-L. Schwartz & A. Morris (1990).
Génération
de trajectoires articulatoires par réseau
séquentiel. 5èmes Journées
NSI, Aussois - France: 191-192.
Bailly G. &
J. Liu (1989). Détection
de formants par quantification vectorielle et réseaux
Markoviens. Actes du séminaire Décodage
Acoustico-Phonétique, Nancy - France, Greco
Dialogue Homme-Machine: 89-94.
Academic documents
Bailly G. (2000).
Représentations phonétiques et technologies
vocales. Habilitation à Diriger des Recherches. Grenoble,
Institut National Polytechnique. (.pdf).
Bailly G. (1983).
Contribution
à la détermination automatique de la prosodie du
Français parlé à partir d'une analyse
syntaxique. Établissement d'un modèle de
génération, Institut National Polytechnique,
Grenoble - France