Peer-reviewed international journals

  1. S. Sadok, S. Leglaive, L. Girin, X. Alameda-Pineda and R. Séguier, “Learning and controlling the source-filter representation of speech with a variational autoencoder,” Speech Communication, vol. 148, pp. 53-65, 2023. https://hal.science/hal-03650569v3

  2. X. Bie, S. Leglaive, X. Alameda-Pineda and L. Girin, “Unsupervised speech enhancement using dynamical variational autoencoders,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2993-3007, 2022. https://hal.science/hal-03295630v1

  3. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “A survey of sound source localization with deep learning methods,” The Journal of the Acoustical Society of America, vol. 152, no. 1, pp. 107-151, 2022. https://hal.science/hal-03952034v1

  4. Y. Ban, X. Alameda-Pineda, L. Girin and R. Horaud, “Variational Bayesian inference for audio-visual tracking of multiple speakers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 5, pp. 1761-1776, 2021. https://hal.science/hal-01950866v2

  5. L. Girin, S. Leglaive, X. Bie, J. Diard, T. Hueber and X. Alameda-Pineda, “Dynamical variational autoencoders: a comprehensive review,” Foundations and Trends in Machine Learning, vol. 15, no. 1-2, pp. 1-175, 2021. https://hal.science/hal-02926215v2

  6. F. Roche, T. Hueber, M. Garnier, S. Limier and L. Girin, “Make that sound more metallic: Towards a perceptually relevant control of the timbre of synthesizer sounds using a variational autoencoder,” Transactions of the International Society for Music Information Retrieval, vol. 4, pp. 52-66, 2021. https://hal.science/hal-03247371v1

  7. T. Hueber, E. Tatulli, L. Girin and J.-L. Schwartz, “Evaluating the potential gain of auditory and audiovisual speech predictive coding using deep learning,” Neural Computation, vol. 32, no. 3, pp. 596-625, 2020. https://hal.science/hal-03016083v1

  8. M. Sadeghi, S. Leglaive, X. Alameda-Pineda, L. Girin and R. Horaud, “Audio-visual speech enhancement using conditional variational auto-encoders,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1788-1800, 2020. https://hal.science/hal-02364900v3

  9. P. Laffitte, Y. Wang, D. Sodoyer and L. Girin, “Assessing the performances of different neural networks architectures for the detection of screams and shouts in public transportation,” Expert Systems With Applications, vol. 117, pp. 29-41, 2019. https://hal.science/hal-01892436v1

  10. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “Online localization and tracking of multiple moving speakers in reverberant environments,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 1, pp. 88-103, 2019. https://hal.science/hal-01851985v2

  11. X. Li, L. Girin, S. Gannot and R. Horaud, “Multichannel online dereverberation based on spectral magnitude inverse filtering,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1365-1377, 2019. https://hal.science/hal-01969041v1

  12. X. Li, L. Girin, S. Gannot and R. Horaud, “Multichannel speech separation and enhancement using the convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 3, pp. 645-659, 2019. https://hal.science/hal-01799809v1

  13. X. Li, L. Girin and R. Horaud, “Expectation-maximization for speech source separation using the convolutive transfer function,” CAAI Transactions on Intelligence Technology, vol. 4, no. 1, pp. 47-53, 2019. https://hal.science/hal-01982250v1

  14. X. Li, S. Leglaive, L. Girin and R. Horaud, “Audio-noise power spectral density estimation using long short-term memory,” IEEE Signal Processing Letters, vol. 26, no. 6, pp. 918-922, 2019. https://hal.science/hal-02100059v1

  15. X. Li, S. Gannot, L. Girin and R. Horaud, “Multichannel identification and nonnegative equalization for dereverberation and noise reduction based on convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 10, pp. 1755-1768, 2018. https://hal.science/hal-01645749v3

  16. F. Bocquelet, T. Hueber, L. Girin, S. Chabardès and B. Yvert, “Key considerations in designing a speech brain-computer interface,” Journal of Physiology - Paris, vol. 110, no. 4, pp. 392-401, 2017. https://hal.science/hal-01978301v1

  17. D. Fabre, T. Hueber, L. Girin, X. Alameda-Pineda and P. Badin, “Automatic animation of an articulatory tongue model from ultrasound images of the vocal tract,” Speech Communication, vol. 93, no. 9, pp. 63-75, 2017.

  18. L. Girin, T. Hueber and X. Alameda-Pineda, “Extending the cascaded Gaussian mixture regression framework for cross-speaker acoustic-articulatory mapping,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 3, pp. 662-673, 2017. https://hal.science/hal-01485540v1

  19. X. Li, L. Girin, R. Horaud and S. Gannot, “Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1997-2012, 2017. https://hal.science/hal-01413417v1

  20. F. Ben Ali, S. Djaziri-Larbi and L. Girin, “A low bit-rate speech codec based on a long-term harmonic plus noise model,” Journal of the Audio Engineering Society, vol. 64, no. 11, pp. 1-14, 2016. https://hal.science/hal-02520614v1

  21. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux and B. Yvert, “Real-time control of an articulatory-based speech synthesizer for brain-computer interfaces,” PLOS Computational Biology, vol. 12, no. 11, 28 p., 2016.

  22. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “A variational EM algorithm for the separation of time-varying convolutive audio mixtures,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 8, pp. 1408-1423, 2016. https://hal.science/hal-01301762v1

  23. X. Li, L. Girin, R. Horaud and S. Gannot, “Estimation of the direct-path relative transfer function for supervised sound source localization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 11, pp. 2171-2186, 2016. https://hal.science/hal-01349691v1

  24. A. Deleforge, R. Horaud, Y. Y. Schechner and L. Girin, “Co-localization of audio sources in images using binaural features and locally-linear regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 4, pp. 718-731, 2015. https://hal.science/hal-01112834v3

  25. T. Hueber, L. Girin, X. Alameda-Pineda and G. Bailly, “Speaker-adaptive acoustic-articulatory inversion using cascaded Gaussian mixture regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2246-2259, 2015.

  26. J. Pinel, L. Girin and C. Baras, “A high-rate data hiding technique for uncompressed audio signals,” Journal of the Audio Engineering Society, vol. 62, no. 6, pp. 400-413, 2014.

  27. K. Grabski, P. Tremblay, V. Gracco, L. Girin and M. Sato, “A mediating role of the auditory dorsal pathway in selective adaptation to speech: a state-dependent transcranial magnetic stimulation study,” Brain Research, vol. 1515, no. 5, pp. 55-65, 2013.

  28. S. Zhang and L. Girin, “Fast and accurate direct MDCT-to-DFT conversion with arbitrary window functions,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 3, pp. 567-578, 2013. https://hal.science/hal-00807031v1

  29. A. Liutkus, J. Pinel, R. Badeau, L. Girin and G. Richard, “Informed source separation through spectrogram coding and data embedding,” Signal Processing, vol. 92, no. 8, pp. 1937-1949, 2012. https://hal.science/hal-00643957v1

  30. S. Marchand, B. Mansencal and L. Girin, “Interactive music with active audio CDs,” Lecture Notes in Computer Science, vol. 6684, pp. 31-50, 2011.

  31. M. Parvaix and L. Girin, “Informed source separation of linear instantaneous under-determined audio mixtures by source index embedding,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 6, pp. 1721-1733, 2011. https://hal.science/hal-00695763v1

  32. L. Girin, “Adaptive long-term coding of LSF parameters trajectories for large-delay/very- to ultra-low bitrate speech coding,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2010, Article ID 597039, 2010. https://hal.science/hal-00534492v1

  33. M. Parvaix, L. Girin and J.-M. Brossier, “A watermarking-based method for informed source separation of audio signals with a single sensor,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, pp. 1464-1475, 2010. https://hal.science/hal-00486809v1

  34. D. Sodoyer, B. Rivet, L. Girin, C. Savariaux, J.-L. Schwartz and C. Jutten, “A study of lip movements during spontaneous dialog and its application to voice activity detection,” Journal of the Acoustical Society of America, vol. 125, no. 2, pp. 1184-1196, 2009. https://hal.science/hal-00941145v1

  35. L. Girin, M. Firouzmand and S. Marchand, “Perceptual long-term variable-rate sinusoidal modeling of speech,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 851-861, 2007.

  36. B. Rivet, L. Girin and C. Jutten, “Log-Rayleigh distribution: a simple and efficient statistical representation of log-spectral coefficients,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 796-802, 2007. https://hal.science/hal-00174096v1

  37. B. Rivet, L. Girin and C. Jutten, “Mixing audiovisual speech processing and blind source separation for the extraction of speech signals from convolutive mixtures,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 96-108, 2007. https://hal.science/hal-00174100v1

  38. B. Rivet, L. Girin and C. Jutten, “Visual voice activity detection as a help for speech source separation from convolutive mixtures,” Speech Communication, vol. 49, no. 7, pp. 667-677, 2007. https://hal.science/hal-00499184v1

  39. G. Bailly, V. Attina, C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin et al., “ARTUS: synthesis and audiovisual watermarking of the movements of a virtual agent interpreting subtitles using cued speech for deaf televiewers,” AMSE Modelling, Measurement and Control-C, vol. 67, no. 2, pp. 177-187, 2006. https://hal.science/hal-00157826v1

  40. L. Girin, “Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding,” IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 265-276, 2004.

  41. D. Sodoyer, L. Girin, C. Jutten and J.-L. Schwartz, “Developing an audio-visual speech source separation algorithm,” Speech Communication, vol. 44, no. 1, pp. 113-125, 2004.

  42. D. Sodoyer, J.-L. Schwartz, L. Girin, J. Klinkisch and C. Jutten, “Separation of audio-visual speech sources: a new approach exploiting the audio-visual coherence of speech stimuli,” EURASIP Journal on Advances in Signal Processing, vol. 2002, no. 11, pp. 1165-1173, 2002.

  43. L. Girin, J.-L. Schwartz and G. Feng, “Audio-visual enhancement of speech in noise,” Journal of the Acoustical Society of America, vol. 109, no. 6, pp. 3007-3020, 2001.

  44. L. Girin, G. Feng and J.-L. Schwartz, “Débruitage de parole par un filtrage utilisant l’image du locuteur : une étude de faisabilité,” Traitement du Signal, vol. 13, no. 4, pp. 319-334, 1996 (in French).

Book chapters

  1. L. Girin, X. Li and S. Gannot, “Audio source separation into the wild,” in Multimodal behavior analysis in the wild, X. Alameda-Pineda, E. Ricci and N. Sebe, eds., Elsevier Academic Press, 2018, pp. 58-78.

  2. G. Feng and L. Girin, “Principles of speech coding,” in Spoken Language Processing, J. Mariani, ed., ISTE Ltd / John Wiley & Sons, London, 2009.

  3. G. Feng and L. Girin, “Principes du codage de la parole,” in Traitement automatique du langage parlé, Tome 1 : Analyse, synthèse et codage de la parole, J. Mariani, ed., Hermès-Lavoisier, Paris, 2002 (in French).

Peer-reviewed international conferences

  1. X. Lin, X. Bie, S. Leglaive, L. Girin and X. Alameda-Pineda, “Speech modeling with a hierarchical Transformer dynamical VAE,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes, Greece, 2023. https://hal.science/hal-04132313v1

  2. X. Lin, S. Leglaive, L. Girin and X. Alameda-Pineda, “Unsupervised speech enhancement with deep dynamical generative speech and noise models,” in Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023. https://hal.science/hal-04132312v1

  3. M.-A. Georges, J. Diard, L. Girin, J.-L. Schwartz and T. Hueber, “Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 2022. https://hal.science/hal-03688189v1

  4. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model,” in Conference of the International Speech Communication Association (INTERSPEECH), Incheon, Korea, 2022. https://hal.science/hal-03791472v1

  5. X. Bie, L. Girin, S. Leglaive, T. Hueber and X. Alameda-Pineda, “A benchmark of dynamical variational autoencoders applied to speech spectrogram modeling,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03295657v1

  6. M.-A. Georges, L. Girin, J.-L. Schwartz and T. Hueber, “Learning a robust speech representation with an articulatory-regularized variational autoencoder,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03373252v1

  7. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “Improved feature extraction for CRNN-based multiple sound source localization,” in European Signal Processing Conference (EUSIPCO), Dublin, Ireland, 2021. https://hal.science/hal-03537334v1

  8. P.-A. Grumiaux, S. Kitić, P. Srivastava, L. Girin and A. Guérin, “SALADnet: self-attentive multisource localization in the Ambisonics domain,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2021. https://hal.science/hal-03537340v1

  9. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “Alternate endings: Improving prosody for incremental neural TTS with predicted future text input,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03372802v1

  10. M.-A. Georges, P. Badin, J. Diard, L. Girin, J.-L. Schwartz and T. Hueber, “Towards an articulatory-driven neural vocoder for speech synthesis,” in International Seminar on Speech Production (ISSP), New Haven, CT, USA, 2020. https://hal.science/hal-03184762v1

  11. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features,” in European Signal Processing Conference (EUSIPCO), Amsterdam, The Netherlands, 2020. https://hal.science/hal-03537323v1

  12. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “Multichannel source counting with a CRNN: Analysis of the performance,” in Forum Acusticum, Lyon, France, 2020. https://hal.science/hal-03235360v1

  13. S. Leglaive, X. Alameda-Pineda, L. Girin and R. Horaud, “A recurrent variational autoencoder for speech enhancement,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020. https://hal.science/hal-02329000v2

  14. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “What the future brings: Investigating the impact of lookahead for incremental neural TTS,” in Conference of the International Speech Communication Association (INTERSPEECH), Shanghai, China, 2020. https://hal.science/hal-02962234v1

  15. X. Alameda-Pineda, S. Arias, Y. Ban, G. Delorme, L. Girin, R. Horaud, X. Li, B. Morgue and G. Sarrazin, “Audio-visual variational fusion for multi-person tracking with robots,” in ACM International Conference on Multimedia (ACMMM), Nice, France, 2019, pp. 1059-1061. https://hal.science/hal-02354514v1

  16. R. Frisch, M. Faix, E. Mazer, J. Droulez and L. Girin, “Bayesian time-domain multiple sound source localization for a stochastic machine,” in European Signal Processing Conference (EUSIPCO), A Coruña, Spain, 2019. https://hal.science/hal-02377220v1

  17. L. Girin, F. Roche, S. Leglaive and T. Hueber, “Notes on the use of variational autoencoders for speech and audio spectrogram modeling,” in International Conference on Digital Audio Effects (DAFx), Birmingham, UK, 2019. https://hal.science/hal-02349385v1

  18. S. Leglaive, L. Girin and R. Horaud, “Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019. https://hal.science/hal-02005102v2

  19. S. Leglaive, U. Simsekli, A. Liutkus, L. Girin and R. Horaud, “Speech enhancement with variational autoencoders and alpha-stable distributions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019. https://hal.science/hal-02005106v1

  20. F. Roche, T. Hueber, S. Limier and L. Girin, “Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models,” in Sound and Music Computing Conference (SMC), Malaga, Spain, 2019. https://hal.science/hal-02349406v1

  21. Y.-T. Ban, X. Li, X. Alameda-Pineda, L. Girin and R. Horaud, “Accounting for room acoustics in audio-visual multi-speaker tracking,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. https://hal.science/hal-01718114v1

  22. S. Leglaive, L. Girin and R. Horaud, “A variance modeling framework based on variational autoencoders for speech enhancement,” in IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 2018. https://hal.science/hal-01832826v1

  23. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “A cascaded multi-speaker localization and tracking system,” in IEEE Workshop on Acoustic Signal Enhancement (IWAENC) – LOCATA Challenge Workshop, Tokyo, Japan, 2018.

  24. X. Li, S. Gannot, L. Girin and R. Horaud, “Multisource MINT using convolutive transfer function,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. https://hal.science/hal-01718106v1

  25. X. Li, B. Mourgue, L. Girin, S. Gannot and R. Horaud, “Online localization of multiple moving speakers in reverberant environments,” in IEEE Workshop on Sensor Array and Multichannel Signal Processing (SAM), Sheffield, UK, 2018. https://hal.science/hal-01795462v1

  26. Q. Nguyen, L. Girin, G. Bailly, F. Elisei and D.-C. Nguyen, “Autonomous sensorimotor learning for sound source localization by a humanoid robot,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) – Workshop on Crossmodal Learning for Intelligent Robotics, Madrid, Spain, 2018. https://hal.science/hal-01921882v1

  27. Y.-T. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “Exploiting the complementarity of audio and visual data in multi-speaker tracking,” in International Conference on Computer Vision, Workshop on Computer Vision for Audio-Visual Media, Venice, Italy, 2017. https://hal.science/hal-01577965v1

  28. M. Fontaine, A. Liutkus, L. Girin and R. Badeau, “Explaining the parameterized Wiener filter with Alpha-stable processes,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01548508v1

  29. R. Frisch, R. Laurent, M. Faix, L. Girin, L. Fesquet, A. Lux, J. Droulez, P. Bessière and E. Mazer, “A Bayesian stochastic machine for sound source localization,” in IEEE International Conference on Rebooting Computing, Washington, DC, USA, 2017. https://hal.science/hal-01644346v1

  30. L. Girin and R. Badeau, “On the use of latent mixing filters in audio source separation,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017. https://hal.science/hal-01400965v1

  31. L. Girin, T. Hueber and X. Alameda-Pineda, “Adaptation of a Gaussian mixture regressor to a new input distribution: Extending the C-GMR framework,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017. https://hal.science/hal-01646098v1

  32. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “An EM algorithm for joint source separation and diarization of multichannel convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017. https://hal.science/hal-01430761v1

  33. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “Exploiting the intermittency of speech for joint separation and diarization,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01568813v1

  34. X. Li, L. Girin and R. Horaud, “An EM algorithm for audio source separation based on the convolutive transfer function,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01568818v1

  35. X. Li, L. Girin and R. Horaud, “Audio source separation based on convolutive transfer function and frequency-domain Lasso optimization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017. https://hal.science/hal-01430754v1

  36. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “An inverse-Gamma source variance prior with factorized parameterization for audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 136-140. https://hal.science/hal-01253169v1

  37. P. Laffitte, D. Sodoyer, C. Tatkeu and L. Girin, “Deep neural networks for automatic detection of screams and shouted speech in subway trains,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 6460-6464. https://hal.science/hal-01385272v1

  38. X. Li, L. Girin, F. Badeig and R. Horaud, “Reverberant sound localization with a robot head based on direct-path relative transfer function,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, South Korea, 2016. https://hal.science/hal-01349771v1

  39. X. Li, L. Girin, S. Gannot and R. Horaud, “Non-stationary noise power spectral density estimation based on regional statistics,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 181-185. https://hal.science/hal-01250892v1

  40. X. Li, R. Horaud, L. Girin and S. Gannot, “Voice activity detection based on statistical likelihood ratio with adaptive thresholding,” in IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), Xi’an, China, 2016. https://hal.science/hal-01349776v1

  41. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux and B. Yvert, “Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study,” in Conference of the International Speech Communication Association (INTERSPEECH), Dresden, Germany, 2015, pp. 2405-2409. https://hal.science/hal-01726265v1

  42. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “A variational EM algorithm for the separation of moving sound sources,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2015, pp. 1-5. Best Student Paper Award. https://hal.science/hal-01169764v2

  43. X. Li, L. Girin, R. Horaud and S. Gannot, “Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 2015, pp. 320-324. https://hal.science/hal-01119186v1

  44. X. Li, R. Horaud, L. Girin and S. Gannot, “Local relative transfer function for sound source localization,” in European Signal Processing Conference (EUSIPCO), Nice, France, 2015, pp. 399-403. https://hal.science/hal-01163675v1

  45. F. Bocquelet, T. Hueber, L. Girin, P. Badin and B. Yvert, “Robust articulatory speech synthesis using deep neural networks for BCI applications,” in Conference of the International Speech Communication Association (INTERSPEECH), Singapore, 2014. https://hal.science/hal-01228891v1

  46. A. Deleforge, V. Drouard, L. Girin and R. Horaud, “Mapping sounds onto images using binaural spectrograms,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 2470-2474. https://hal.science/hal-01019287v1

  47. M. Janvier, X. Alameda-Pineda, L. Girin and R. Horaud, “Sound representation and classification benchmark for domestic robots,” in IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 2014, pp. 6285-6292. https://hal.science/hal-00952092v1

  48. S. Kırbız, A. Ozerov, A. Liutkus and L. Girin, “Perceptual coding-based informed source separation,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 959-963. https://hal.science/hal-01016314v1

  49. M. Janvier, R. Horaud, L. Girin, F. Berthommier, L.-J. Boë, C. Kemp, A. Rey and T. Legou, “Supervised classification of baboon vocalizations,” in Neural Information Processing Scaled for Bioacoustics (NIPS4B), Lake Tahoe, Nevada, USA, 2013. https://hal.science/hal-00910104v1

  50. S. Zhang, L. Girin and A. Liutkus, “Informed source separation from compressed mixtures using spatial Wiener filter and quantization noise estimation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, 2013, pp. 61-65. https://hal.science/hal-00940328v1

  51. F. Berthommier, L. Girin and L.-J. Boë, “A simple hybrid acoustic and morphologically constrained technique for the synthesis of stop consonants in various vocalic contexts,” in Conference of the International Speech Communication Association (INTERSPEECH), Portland, USA, 2012. https://hal.science/hal-00807519v1

  52. T. Gerber, M. Dutasta, L. Girin and C. Févotte, “Professionally-produced music separation guided by covers,” in International Society for Music Information Retrieval Conference (ISMIR), Porto, Portugal, 2012. https://hal.science/hal-00807027v1

  53. M. Janvier, X. Alameda-Pineda, L. Girin and R. Horaud, “Sound-event recognition with a companion humanoid,” in IEEE/RAS International Conference on Humanoid Robots (Humanoids), Osaka, Japan, 2012. https://hal.science/hal-00768767v1

  54. A. Liutkus, S. Gorlow, N. Sturmel, S. Zhang, L. Girin, R. Badeau, L. Daudet, S. Marchand and G. Richard, “Informed audio source separation: a comparative study,” in European Signal Processing Conference (EUSIPCO), Bucharest, Romania, 2012. https://hal.science/hal-00809525v1

  55. S. Marchand, R. Badeau, C. Baras, L. Daudet, D. Fourer, L. Girin, S. Gorlow, A. Liutkus, J. Pinel, G. Richard et al., “DReaM: a novel system for joint source separation and multitrack coding,” in Audio Engineering Society (AES) Convention, San Francisco, USA, 2012. https://hal.science/hal-00809503v1

  56. N. Sturmel, L. Daudet and L. Girin, “Phase-based informed source separation for active listening of music,” in International Conference on Digital Audio Effects (DAFx), York, UK, 2012. https://hal.science/hal-00807001v1

  57. N. Sturmel, A. Liutkus, J. Pinel, L. Girin, S. Marchand, G. Richard, R. Badeau and L. Daudet, “Linear mixing models for active listening of music productions in realistic studio conditions,” in Audio Engineering Society (AES) Convention, Budapest, Hungary, 2012. Best Paper Award. https://hal.science/hal-00790783v1

  58. F. Ben Ali, L. Girin and S. Djaziri-Larbi, “A long-term harmonic plus noise model for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 53-56. https://hal.science/hal-00695752v1

  59. L. Girin and J. Pinel, “Informed audio source separation from compressed linear stereo mixtures,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011. https://hal.science/hal-00695724v1

  60. J. Pinel and L. Girin, “A high-rate data hiding technique for audio signals based on IntMDCT quantization,” in International Conference on Digital Audio Effects (DAFx), Paris, France, 2011, pp. 353-356. https://hal.science/hal-00695759v1

  61. J. Pinel and L. Girin, “Sparsification of audio signals using the MDCT/IntMDCT and a psychoacoustic model. Application to informed audio source separation,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011. https://hal.science/hal-00695730v1

  62. S. Zhang and L. Girin, “An informed source separation system for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 573-576. https://hal.science/hal-00695758v1

  63. F. Ben Ali, L. Girin and S. Djaziri-Larbi, “Long-term modelling of parameters trajectories for the harmonic plus noise model of speech signals,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  64. S. Marchand, B. Mansencal and L. Girin, “Interactive music with active audio CDs,” in International Symposium on Computer Music Modeling and Retrieval (CMMR), Malaga, Spain, 2010. https://hal.science/hal-00502792v1

  65. M. Mazuel, D. David and L. Girin, “Linking motion sensors and digital signal processing for real-time musical transformations,” in International Conference on Haptic Audio Interaction Design (HAID) (demo session), Copenhagen, Denmark, 2010.

  66. M. Parvaix and L. Girin, “Informed source separation of underdetermined instantaneous stereo mixtures using source index embedding,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, 2010, pp. 245-248. https://hal.science/hal-00486804v1

  67. M. Parvaix, L. Girin, L. Daudet, J. Pinel and C. Baras, “Hybrid coding/indexing strategy for informed source separation of linear instantaneous under-determined audio mixtures,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  68. J. Pinel, L. Girin, C. Baras and M. Parvaix, “A high-capacity watermarking technique for audio signals based on MDCT-domain quantization,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  69. M. Parvaix, L. Girin et J.-M. Brossier, “A watermarking-based method for single-channel audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, 2009, p. 101-104.

  70. M. Firouzmand et L. Girin, “Long-term flexible 2D cepstral modeling of speech spectral amplitudes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, p. 3937-3940.

  71. K. Hermus, L. Girin, H. Van hamme et S. Irhimeh, “Estimation of the voicing cut-off frequency contour of natural speech based on harmonic and aperiodic energies,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, p. 4473-4476.

  72. A. Aubrey, B. Rivet, Y. Hicks, L. Girin, J. Chambers et C. Jutten, “Two novel visual voice activity detectors based on appearance models and retinal filtering,” in European Signal Processing Conference (EUSIPCO), Poznan, Poland, 2007, p. 2409-2413.

  73. D. Beautemps, L. Girin, N. Aboutabit, G. Bailly, L. Besacier, G. Breton, T. Burger, A. Caplier, M.-A. Cathiard, D. Chêne et f, “TELMA: telephony for the hearing-impaired people. from models to user tests,” in International Conference on Assistive Technologies (ASSISTH), Toulouse, France, 2007, p. 201-208.

  74. L. Girin, “Long-term quantization of speech LSF parameters,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, Hawaii, USA, 2007.

  75. B. Rivet, A. J. Aubrey, L. Girin, Y. Hicks, C. Jutten et J. Chambers, “Development and comparison of two approaches for visual speech analysis with application to voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007, p. 14.

  76. B. Rivet, L. Girin, C. Serviere, D.-T. Pham et C. Jutten, “Audiovisual speech source separation: a regularization method based on visual voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007.

  77. B. Rivet, L. Girin, C. Serviere, D.-T. Pham et C. Jutten, “Using a visual voice activity detector to regularize the permutations in blind separation of convolutive speech mixtures,” in IEEE International Conference on Digital Signal Processing (DSP), Cardiff, Wales, 2007, p. 223-226.

  78. L. Girin, “Theoretical and experimental bases of a new method for accurate separation of harmonic and noise components of speech signals,” in European Signal Processing Conference (EUSIPCO), Florence, Italy, 2006, p. 1-5.

  79. D. Sodoyer, B. Rivet, L. Girin, J.-L. Schwartz et C. Jutten, “An analysis of visual speech information applied to voice activity detection,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, 2006.

  80. M. Firouzmand et L. Girin, “Perceptually weighted long term modeling of sinusoidal speech amplitude trajectories,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

  81. M. Firouzmand, L. Girin et S. Marchand, “Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech,” in Conference of the International Speech Communication Association (INTERSPEECH), Lisbon, Portugal, 2005, p. 357-360.

  82. M. Raspaud, S. Marchand et L. Girin, “A generalized polynomial and sinusoidal model for partial tracking and time stretching,” in International Conference on Digital Audio Effects (DAFx), Madrid, Spain, 2005, p. 24-29.

  83. B. Rivet, L. Girin et C. Jutten, “Solving the indeterminations of blind source separation of convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

  84. D. Beautemps, T. Burger et L. Girin, “Characterizing and classifying cued speech vowels from labial parameters,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

  85. L. Girin, M. Firouzmand et S. Marchand, “Long-term modeling of phase trajectories within the speech sinusoidal model framework,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

  86. L. Girin et S. Marchand, “Watermarking of speech signals using the sinusoidal model and frequency modulation of the partials,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Montreal, Canada, 2004.

  87. B. Rivet, L. Girin, C. Jutten et J.-L. Schwartz, “Using audiovisual speech processing to improve the robustness of the separation of convolutive speech mixtures,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Siena, Italy, 2004, p. 47-50.

  88. L. Girin, “Pure audio McGurk effect,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

  89. L. Girin, S. Marchand, J. Di Martino, A. Robel et G. Peeters, “Comparing the order of a polynomial phase model for the synthesis of quasi-harmonic audio signals,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2003, p. 193-196.

  90. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Speech extraction based on ICA and audio-visual coherence,” in IEEE International Symposium on Signal Processing and Its Applications (ISSPA), Paris, France, 2003.

  91. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Extracting an AV speech source from a mixture of signals,” in European Conference on Speech Communication and Technology (EUROSPEECH), Geneva, Switzerland, 2003.

  92. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Further experiments on audio-visual speech source separation,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

  93. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Audio-visual speech source separation,” in International Conference on Spoken Language Processing (ICSLP), Denver, USA, 2002.

  94. L. Girin, A. Allard et J.-L. Schwartz, “Speech signals separation: a new approach exploiting the coherence of audio and visual speech,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Cannes, France, 2001, p. 631-636.

  95. E. Foucher, G. Feng et L. Girin, “A preliminary study of an audio-visual speech coder: using video parameters to reduce an LPC vocoder bit rate,” in European Signal Processing Conference (EUSIPCO), Rhodes, Greece, 1998, p. 1-4.

  96. E. Foucher, L. Girin et G. Feng, “An audiovisual speech coder using vector quantization to exploit the audio/video correlation,” in International Conference on Audio-Visual Speech Processing (AVSP), Sydney, Australia, 1998.

  97. L. Girin, G. Feng et J.-L. Schwartz, “Fusion of auditory and visual information for noisy speech enhancement: a preliminary study of vowel transitions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, USA, 1998, p. 1005-1008.

  98. L. Girin, E. Foucher et G. Feng, “An audio-visual distance for audio-visual speech vector quantization,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

  99. L. Girin, L. Varin, G. Feng et J.-L. Schwartz, “A signal processing system for having the sound pop-out in noise thanks to the image of the speaker’s lips: new advances using multi-layer perceptrons,” in International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, 1998.

  100. L. Girin, L. Varin, G. Feng et J.-L. Schwartz, “Audiovisual speech enhancement: new advances using multi-layer perceptrons,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

  101. L. Girin, G. Feng et J.-L. Schwartz, “Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions,” in European Conference on Speech Communication and Technology (EUROSPEECH), Rhodes, Greece, 1997.

  102. L. Girin, J.-L. Schwartz et G. Feng, “Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audio-visual fusion,” in International Conference on Audio-Visual Speech Processing (AVSP), Rhodes, Greece, 1997.

  103. L. Girin, G. Feng et J.-L. Schwartz, “Noisy speech enhancement with filters estimated from the speaker’s lips,” in European Conference on Speech Communication and Technology (EUROSPEECH), Madrid, Spain, 1995.