Peer-reviewed international journals

  1. S. Sadok, S. Leglaive, L. Girin, X. Alameda-Pineda and R. Séguier, “Learning and controlling the source-filter representation of speech with a variational autoencoder,” Speech Communication, vol. 148, pp. 53-65, 2023. https://hal.science/hal-03650569v3

  2. X. Bie, S. Leglaive, X. Alameda-Pineda and L. Girin, “Unsupervised speech enhancement using dynamical variational autoencoders,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2993-3007, 2022. https://hal.science/hal-03295630v1

  3. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “A survey of sound source localization with deep learning methods,” The Journal of the Acoustical Society of America, vol. 152, no. 1, pp. 107-151, 2022. https://hal.science/hal-03952034v1

  4. Y. Ban, X. Alameda-Pineda, L. Girin and R. Horaud, “Variational Bayesian inference for audio-visual tracking of multiple speakers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 5, pp. 1761-1776, 2021. https://hal.science/hal-01950866v2

  5. L. Girin, S. Leglaive, X. Bie, J. Diard, T. Hueber and X. Alameda-Pineda, “Dynamical variational autoencoders: a comprehensive review,” Foundations and Trends in Machine Learning, vol. 15, no. 1-2, pp. 1-175, 2021. https://hal.science/hal-02926215v2

  6. F. Roche, T. Hueber, M. Garnier, S. Limier and L. Girin, “Make that sound more metallic: Towards a perceptually relevant control of the timbre of synthesizer sounds using a variational autoencoder,” Transactions of the International Society for Music Information Retrieval, vol. 4, pp. 52-66, 2021. https://hal.science/hal-03247371v1

  7. T. Hueber, E. Tatulli, L. Girin and J.-L. Schwartz, “Evaluating the potential gain of auditory and audiovisual speech predictive coding using deep learning,” Neural Computation, vol. 32, no. 3, pp. 596-625, 2020. https://hal.science/hal-03016083v1

  8. M. Sadeghi, S. Leglaive, X. Alameda-Pineda, L. Girin and R. Horaud, “Audio-visual speech enhancement using conditional variational auto-encoders,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1788-1800, 2020. https://hal.science/hal-02364900v3

  9. P. Laffitte, Y. Wang, D. Sodoyer and L. Girin, “Assessing the performances of different neural networks architectures for the detection of screams and shouts in public transportation,” Expert Systems With Applications, vol. 117, pp. 29-41, 2019. https://hal.science/hal-01892436v1

  10. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “Online localization and tracking of multiple moving speakers in reverberant environments,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 1, pp. 88-103, 2019. https://hal.science/hal-01851985v2

  11. X. Li, L. Girin, S. Gannot and R. Horaud, “Multichannel online dereverberation based on spectral magnitude inverse filtering,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1365-1377, 2019. https://hal.science/hal-01969041v1

  12. X. Li, L. Girin, S. Gannot and R. Horaud, “Multichannel speech separation and enhancement using the convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 3, pp. 645-659, 2019. https://hal.science/hal-01799809v1

  13. X. Li, L. Girin and R. Horaud, “Expectation-maximization for speech source separation using the convolutive transfer function,” CAAI Transactions on Intelligence Technology, vol. 4, no. 1, pp. 47-53, 2019. https://hal.science/hal-01982250v1

  14. X. Li, S. Leglaive, L. Girin and R. Horaud, “Audio-noise power spectral density estimation using long short-term memory,” IEEE Signal Processing Letters, vol. 26, no. 6, pp. 918-922, 2019. https://hal.science/hal-02100059v1

  15. X. Li, S. Gannot, L. Girin and R. Horaud, “Multichannel identification and nonnegative equalization for dereverberation and noise reduction based on convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 10, pp. 1755-1768, 2018. https://hal.science/hal-01645749v3

  16. F. Bocquelet, T. Hueber, L. Girin, S. Chabardès and B. Yvert, “Key considerations in designing a speech brain-computer interface,” Journal of Physiology - Paris, vol. 110, no. 4, pp. 392-401, 2017. https://hal.science/hal-01978301v1

  17. D. Fabre, T. Hueber, L. Girin, X. Alameda-Pineda and P. Badin, “Automatic animation of an articulatory tongue model from ultrasound images of the vocal tract,” Speech Communication, vol. 93, no. 9, pp. 63-75, 2017.

  18. L. Girin, T. Hueber and X. Alameda-Pineda, “Extending the cascaded Gaussian mixture regression framework for cross-speaker acoustic-articulatory mapping,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 3, pp. 662-673, 2017. https://hal.science/hal-01485540v1

  19. X. Li, L. Girin, R. Horaud and S. Gannot, “Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1997-2012, 2017. https://hal.science/hal-01413417v1

  20. F. Ben Ali, S. Djaziri-Larbi and L. Girin, “A low bit-rate speech codec based on a long-term harmonic plus noise model,” Journal of the Audio Engineering Society, vol. 64, no. 11, pp. 1-14, 2016. https://hal.science/hal-02520614v1

  21. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux and B. Yvert, “Real-time control of an articulatory-based speech synthesizer for brain-computer interfaces,” PLOS Computational Biology, vol. 12, no. 11, 28 p., 2016.

  22. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “A variational EM algorithm for the separation of time-varying convolutive audio mixtures,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 8, pp. 1408-1423, 2016. https://hal.science/hal-01301762v1

  23. X. Li, L. Girin, R. Horaud and S. Gannot, “Estimation of the direct-path relative transfer function for supervised sound source localization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 11, pp. 2171-2186, 2016. https://hal.science/hal-01349691v1

  24. A. Deleforge, R. Horaud, Y. Y. Schechner and L. Girin, “Co-localization of audio sources in images using binaural features and locally-linear regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 4, pp. 718-731, 2015. https://hal.science/hal-01112834v3

  25. T. Hueber, L. Girin, X. Alameda-Pineda and G. Bailly, “Speaker-adaptive acoustic-articulatory inversion using cascaded Gaussian mixture regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2246-2259, 2015.

  26. J. Pinel, L. Girin and C. Baras, “A high-rate data hiding technique for uncompressed audio signals,” Journal of the Audio Engineering Society, vol. 62, no. 6, pp. 400-413, 2014.

  27. K. Grabski, P. Tremblay, V. Gracco, L. Girin and M. Sato, “A mediating role of the auditory dorsal pathway in selective adaptation to speech: a state-dependent transcranial magnetic stimulation study,” Brain Research, vol. 1515, no. 5, pp. 55-65, 2013.

  28. S. Zhang and L. Girin, “Fast and accurate direct MDCT-to-DFT conversion with arbitrary window functions,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 3, pp. 567-578, 2013. https://hal.science/hal-00807031v1

  29. A. Liutkus, J. Pinel, R. Badeau, L. Girin and G. Richard, “Informed source separation through spectrogram coding and data embedding,” Signal Processing, vol. 92, no. 8, pp. 1937-1949, 2012. https://hal.science/hal-00643957v1

  30. S. Marchand, B. Mansencal and L. Girin, “Interactive music with active audio CDs,” Lecture Notes in Computer Science, vol. 6684, pp. 31-50, 2011.

  31. M. Parvaix and L. Girin, “Informed source separation of linear instantaneous under-determined audio mixtures by source index embedding,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 6, pp. 1721-1733, 2011. https://hal.science/hal-00695763v1

  32. L. Girin, “Adaptive long-term coding of LSF parameters trajectories for large-delay/very- to ultra-low bitrate speech coding,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2010, Article ID 597039, 2010. https://hal.science/hal-00534492v1

  33. M. Parvaix, L. Girin and J.-M. Brossier, “A watermarking-based method for informed source separation of audio signals with a single sensor,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, pp. 1464-1475, 2010. https://hal.science/hal-00486809v1

  34. D. Sodoyer, B. Rivet, L. Girin, C. Savariaux, J.-L. Schwartz and C. Jutten, “A study of lip movements during spontaneous dialog and its application to voice activity detection,” Journal of the Acoustical Society of America, vol. 125, no. 2, pp. 1184-1196, 2009. https://hal.science/hal-00941145v1

  35. L. Girin, M. Firouzmand and S. Marchand, “Perceptual long-term variable-rate sinusoidal modeling of speech,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 851-861, 2007.

  36. B. Rivet, L. Girin and C. Jutten, “Log-Rayleigh distribution: a simple and efficient statistical representation of log-spectral coefficients,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 796-802, 2007. https://hal.science/hal-00174096v1

  37. B. Rivet, L. Girin and C. Jutten, “Mixing audiovisual speech processing and blind source separation for the extraction of speech signals from convolutive mixtures,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 96-108, 2007. https://hal.science/hal-00174100v1

  38. B. Rivet, L. Girin and C. Jutten, “Visual voice activity detection as a help for speech source separation from convolutive mixtures,” Speech Communication, vol. 49, no. 7, pp. 667-677, 2007. https://hal.science/hal-00499184v1

  39. G. Bailly, V. Attina, C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin et al., “ARTUS: synthesis and audiovisual watermarking of the movements of a virtual agent interpreting subtitles using cued speech for deaf televiewers,” AMSE Modelling, Measurement and Control-C, vol. 67, no. 2, pp. 177-187, 2006. https://hal.science/hal-00157826v1

  40. L. Girin, “Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding,” IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 265-276, 2004.

  41. D. Sodoyer, L. Girin, C. Jutten and J.-L. Schwartz, “Developing an audio-visual speech source separation algorithm,” Speech Communication, vol. 44, no. 1, pp. 113-125, 2004.

  42. D. Sodoyer, J.-L. Schwartz, L. Girin, J. Klinkisch and C. Jutten, “Separation of audio-visual speech sources: a new approach exploiting the audio-visual coherence of speech stimuli,” EURASIP Journal on Advances in Signal Processing, vol. 2002, no. 11, pp. 1165-1173, 2002.

  43. L. Girin, J.-L. Schwartz and G. Feng, “Audio-visual enhancement of speech in noise,” Journal of the Acoustical Society of America, vol. 109, no. 6, pp. 3007-3020, 2001.

  44. L. Girin, G. Feng and J.-L. Schwartz, “Débruitage de parole par un filtrage utilisant l’image du locuteur : une étude de faisabilité,” Traitement du Signal, vol. 13, no. 4, pp. 319-334, 1996 (in French).

Book chapters

  1. L. Girin, X. Li and S. Gannot, “Audio source separation into the wild,” in Multimodal behavior analysis in the wild, X. Alameda-Pineda, E. Ricci and N. Sebe, eds., Elsevier Academic Press, 2018, pp. 58-78.

  2. G. Feng and L. Girin, “Principles of speech coding,” in Spoken Language Processing, J. Mariani, ed., ISTE Ltd / John Wiley & Sons, London, 2009.

  3. G. Feng and L. Girin, “Principes du codage de la parole,” in Traitement automatique du langage parlé, Tome 1 : Analyse, synthèse et codage de la parole, J. Mariani, ed., Hermès-Lavoisier, Paris, 2002 (in French).

Peer-reviewed international conferences

  1. X. Lin, X. Bie, S. Leglaive, L. Girin and X. Alameda-Pineda, “Speech modeling with a hierarchical Transformer dynamical VAE,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes, Greece, 2023. https://hal.science/hal-04132313v1

  2. X. Lin, S. Leglaive, L. Girin and X. Alameda-Pineda, “Unsupervised speech enhancement with deep dynamical generative speech and noise models,” in Conference of the International Speech Communication Association (INTERSPEECH), Dublin, Ireland, 2023. https://hal.science/hal-04132312v1

  3. M.-A. Georges, J. Diard, L. Girin, J.-L. Schwartz and T. Hueber, “Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 2022. https://hal.science/hal-03688189v1

  4. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model,” in Conference of the International Speech Communication Association (INTERSPEECH), Incheon, Korea, 2022. https://hal.science/hal-03791472v1

  5. X. Bie, L. Girin, S. Leglaive, T. Hueber and X. Alameda-Pineda, “A benchmark of dynamical variational autoencoders applied to speech spectrogram modeling,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03295657v1

  6. M.-A. Georges, L. Girin, J.-L. Schwartz and T. Hueber, “Learning a robust speech representation with an articulatory-regularized variational autoencoder,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03373252v1

  7. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “Improved feature extraction for CRNN-based multiple sound source localization,” in European Signal Processing Conference (EUSIPCO), Dublin, Ireland, 2021. https://hal.science/hal-03537334v1

  8. P.-A. Grumiaux, S. Kitić, P. Srivastava, L. Girin and A. Guérin, “SALADnet: self-attentive multisource localization in the Ambisonics domain,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2021. https://hal.science/hal-03537340v1

  9. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “Alternate endings: Improving prosody for incremental neural TTS with predicted future text input,” in Conference of the International Speech Communication Association (INTERSPEECH), Brno, Czech Republic, 2021. https://hal.science/hal-03372802v1

  10. M.-A. Georges, P. Badin, J. Diard, L. Girin, J.-L. Schwartz and T. Hueber, “Towards an articulatory-driven neural vocoder for speech synthesis,” in International Seminar on Speech Production (ISSP), New Haven, CT, USA, 2020. https://hal.science/hal-03184762v1

  11. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features,” in European Signal Processing Conference (EUSIPCO), Amsterdam, The Netherlands, 2020. https://hal.science/hal-03537323v1

  12. P.-A. Grumiaux, S. Kitić, L. Girin and A. Guérin, “Multichannel source counting with a CRNN: Analysis of the performance,” in Forum Acusticum, Lyon, France, 2020. https://hal.science/hal-03235360v1

  13. S. Leglaive, X. Alameda-Pineda, L. Girin and R. Horaud, “A recurrent variational autoencoder for speech enhancement,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020. https://hal.science/hal-02329000v2

  14. B. Stephenson, L. Besacier, L. Girin and T. Hueber, “What the future brings: Investigating the impact of lookahead for incremental neural TTS,” in Conference of the International Speech Communication Association (INTERSPEECH), Shanghai, China, 2020. https://hal.science/hal-02962234v1

  15. X. Alameda-Pineda, S. Arias, Y. Ban, G. Delorme, L. Girin, R. Horaud, X. Li, B. Morgue and G. Sarrazin, “Audio-visual variational fusion for multi-person tracking with robots,” in ACM International Conference on Multimedia (ACMMM), Nice, France, 2019, pp. 1059-1061. https://hal.science/hal-02354514v1

  16. R. Frisch, M. Faix, E. Mazer, J. Droulez and L. Girin, “Bayesian time-domain multiple sound source localization for a stochastic machine,” in European Signal Processing Conference (EUSIPCO), A Coruña, Spain, 2019. https://hal.science/hal-02377220v1

  17. L. Girin, F. Roche, S. Leglaive and T. Hueber, “Notes on the use of variational autoencoders for speech and audio spectrogram modeling,” in International Conference on Digital Audio Effects (DAFx), Birmingham, UK, 2019. https://hal.science/hal-02349385v1

  18. S. Leglaive, L. Girin and R. Horaud, “Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019. https://hal.science/hal-02005102v2

  19. S. Leglaive, U. Simsekli, A. Liutkus, L. Girin and R. Horaud, “Speech enhancement with variational autoencoders and alpha-stable distributions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019. https://hal.science/hal-02005106v1

  20. F. Roche, T. Hueber, S. Limier and L. Girin, “Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models,” in Sound and Music Computing Conference (SMC), Malaga, Spain, 2019. https://hal.science/hal-02349406v1

  21. Y.-T. Ban, X. Li, X. Alameda-Pineda, L. Girin and R. Horaud, “Accounting for room acoustics in audio-visual multi-speaker tracking,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. https://hal.science/hal-01718114v1

  22. S. Leglaive, L. Girin and R. Horaud, “A variance modeling framework based on variational autoencoders for speech enhancement,” in IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 2018. https://hal.science/hal-01832826v1

  23. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “A cascaded multi-speaker localization and tracking system,” in IEEE Workshop on Acoustic Signal Enhancement (IWAENC) – LOCATA Challenge Workshop, Tokyo, Japan, 2018.

  24. X. Li, S. Gannot, L. Girin and R. Horaud, “Multisource MINT using convolutive transfer function,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018. https://hal.science/hal-01718106v1

  25. X. Li, B. Mourgue, L. Girin, S. Gannot and R. Horaud, “Online localization of multiple moving speakers in reverberant environments,” in IEEE Workshop on Sensor Array and Multichannel Signal Processing (SAM), Sheffield, UK, 2018. https://hal.science/hal-01795462v1

  26. Q. Nguyen, L. Girin, G. Bailly, F. Elisei and D.-C. Nguyen, “Autonomous sensorimotor learning for sound source localization by a humanoid robot,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) – Workshop on Crossmodal Learning for Intelligent Robotics, Madrid, Spain, 2018. https://hal.science/hal-01921882v1

  27. Y.-T. Ban, L. Girin, X. Alameda-Pineda and R. Horaud, “Exploiting the complementarity of audio and visual data in multi-speaker tracking,” in International Conference on Computer Vision, Workshop on Computer Vision for Audio-Visual Media, Venice, Italy, 2017. https://hal.science/hal-01577965v1

  28. M. Fontaine, A. Liutkus, L. Girin and R. Badeau, “Explaining the parameterized Wiener filter with Alpha-stable processes,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01548508v1

  29. R. Frisch, R. Laurent, M. Faix, L. Girin, L. Fesquet, A. Lux, J. Droulez, P. Bessière and E. Mazer, “A Bayesian stochastic machine for sound source localization,” in IEEE International Conference on Rebooting Computing, Washington, DC, USA, 2017. https://hal.science/hal-01644346v1

  30. L. Girin and R. Badeau, “On the use of latent mixing filters in audio source separation,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017. https://hal.science/hal-01400965v1

  31. L. Girin, T. Hueber and X. Alameda-Pineda, “Adaptation of a Gaussian mixture regressor to a new input distribution: Extending the C-GMR framework,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017. https://hal.science/hal-01646098v1

  32. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “An EM algorithm for joint source separation and diarization of multichannel convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017. https://hal.science/hal-01430761v1

  33. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “Exploiting the intermittency of speech for joint separation and diarization,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01568813v1

  34. X. Li, L. Girin and R. Horaud, “An EM algorithm for audio source separation based on the convolutive transfer function,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017. https://hal.science/hal-01568818v1

  35. X. Li, L. Girin and R. Horaud, “Audio source separation based on convolutive transfer function and frequency-domain Lasso optimization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017. https://hal.science/hal-01430754v1

  36. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “An inverse-Gamma source variance prior with factorized parameterization for audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 136-140. https://hal.science/hal-01253169v1

  37. P. Laffitte, D. Sodoyer, C. Tatkeu and L. Girin, “Deep neural networks for automatic detection of screams and shouted speech in subway trains,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 6460-6464. https://hal.science/hal-01385272v1

  38. X. Li, L. Girin, F. Badeig and R. Horaud, “Reverberant sound localization with a robot head based on direct-path relative transfer function,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, South Korea, 2016. https://hal.science/hal-01349771v1

  39. X. Li, L. Girin, S. Gannot and R. Horaud, “Non-stationary noise power spectral density estimation based on regional statistics,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 181-185. https://hal.science/hal-01250892v1

  40. X. Li, R. Horaud, L. Girin and S. Gannot, “Voice activity detection based on statistical likelihood ratio with adaptive thresholding,” in IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), Xi’an, China, 2016. https://hal.science/hal-01349776v1

  41. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux and B. Yvert, “Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study,” in Conference of the International Speech Communication Association (INTERSPEECH), Dresden, Germany, 2015, pp. 2405-2409. https://hal.science/hal-01726265v1

  42. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot and R. Horaud, “A variational EM algorithm for the separation of moving sound sources,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2015, pp. 1-5. Best Student Paper Award. https://hal.science/hal-01169764v2

  43. X. Li, L. Girin, R. Horaud and S. Gannot, “Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 2015, pp. 320-324. https://hal.science/hal-01119186v1

  44. X. Li, R. Horaud, L. Girin and S. Gannot, “Local relative transfer function for sound source localization,” in European Signal Processing Conference (EUSIPCO), Nice, France, 2015, pp. 399-403. https://hal.science/hal-01163675v1

  45. F. Bocquelet, T. Hueber, L. Girin, P. Badin and B. Yvert, “Robust articulatory speech synthesis using deep neural networks for BCI applications,” in Conference of the International Speech Communication Association (INTERSPEECH), Singapore, 2014. https://hal.science/hal-01228891v1

  46. A. Deleforge, V. Drouard, L. Girin and R. Horaud, “Mapping sounds onto images using binaural spectrograms,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 2470-2474. https://hal.science/hal-01019287v1

  47. M. Janvier, X. Alameda-Pineda, L. Girin and R. Horaud, “Sound representation and classification benchmark for domestic robots,” in IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 2014, pp. 6285-6292. https://hal.science/hal-00952092v1

  48. S. Kırbız, A. Ozerov, A. Liutkus and L. Girin, “Perceptual coding-based informed source separation,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 959-963. https://hal.science/hal-01016314v1

  49. M. Janvier, R. Horaud, L. Girin, F. Berthommier, L.-J. Boë, C. Kemp, A. Rey and T. Legou, “Supervised classification of baboon vocalizations,” in Neural Information Processing Scaled for Bioacoustics (NIPS4B), Lake Tahoe, Nevada, USA, 2013. https://hal.science/hal-00910104v1

  50. S. Zhang, L. Girin and A. Liutkus, “Informed source separation from compressed mixtures using spatial Wiener filter and quantization noise estimation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, 2013, pp. 61-65. https://hal.science/hal-00940328v1

  51. F. Berthommier, L. Girin and L.-J. Boë, “A simple hybrid acoustic and morphologically constrained technique for the synthesis of stop consonants in various vocalic contexts,” in Conference of the International Speech Communication Association (INTERSPEECH), Portland, USA, 2012. https://hal.science/hal-00807519v1

  52. T. Gerber, M. Dutasta, L. Girin and C. Févotte, “Professionally-produced music separation guided by covers,” in International Society for Music Information Retrieval Conference (ISMIR), Porto, Portugal, 2012. https://hal.science/hal-00807027v1

  53. M. Janvier, X. Alameda-Pineda, L. Girin and R. Horaud, “Sound-event recognition with a companion humanoid,” in IEEE/RAS International Conference on Humanoid Robots (Humanoids), Osaka, Japan, 2012. https://hal.science/hal-00768767v1

  54. A. Liutkus, S. Gorlow, N. Sturmel, S. Zhang, L. Girin, R. Badeau, L. Daudet, S. Marchand and G. Richard, “Informed audio source separation: a comparative study,” in European Signal Processing Conference (EUSIPCO), Bucharest, Romania, 2012. https://hal.science/hal-00809525v1

  55. S. Marchand, R. Badeau, C. Baras, L. Daudet, D. Fourer, L. Girin, S. Gorlow, A. Liutkus, J. Pinel, G. Richard et al., “DReaM: a novel system for joint source separation and multitrack coding,” in Audio Engineering Society (AES) Convention, San Francisco, USA, 2012. https://hal.science/hal-00809503v1

  56. N. Sturmel, L. Daudet and L. Girin, “Phase-based informed source separation for active listening of music,” in International Conference on Digital Audio Effects (DAFx), York, UK, 2012. https://hal.science/hal-00807001v1

  57. N. Sturmel, A. Liutkus, J. Pinel, L. Girin, S. Marchand, G. Richard, R. Badeau and L. Daudet, “Linear mixing models for active listening of music productions in realistic studio conditions,” in Audio Engineering Society (AES) Convention, Budapest, Hungary, 2012. Best Paper Award. https://hal.science/hal-00790783v1

  58. F. Ben Ali, L. Girin and S. Djaziri-Larbi, “A long-term harmonic plus noise model for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 53-56. https://hal.science/hal-00695752v1

  59. L. Girin and J. Pinel, “Informed audio source separation from compressed linear stereo mixtures,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011. https://hal.science/hal-00695724v1

  60. J. Pinel and L. Girin, “A high-rate data hiding technique for audio signals based on IntMDCT quantization,” in International Conference on Digital Audio Effects (DAFx), Paris, France, 2011, pp. 353-356. https://hal.science/hal-00695759v1

  61. J. Pinel and L. Girin, “Sparsification of audio signals using the MDCT/IntMDCT and a psychoacoustic model. Application to informed audio source separation,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011. https://hal.science/hal-00695730v1

  62. S. Zhang and L. Girin, “An informed source separation system for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 573-576. https://hal.science/hal-00695758v1

  63. F. Ben Ali, L. Girin and S. Djaziri-Larbi, “Long-term modelling of parameters trajectories for the harmonic plus noise model of speech signals,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  64. S. Marchand, B. Mansencal and L. Girin, “Interactive music with active audio CDs,” in International Symposium on Computer Music Modeling and Retrieval (CMMR), Malaga, Spain, 2010. https://hal.science/hal-00502792v1

  65. M. Mazuel, D. David and L. Girin, “Linking motion sensors and digital signal processing for real-time musical transformations,” in International Conference on Haptic Audio Interaction Design (HAID) (demo session), Copenhagen, Denmark, 2010.

  66. M. Parvaix and L. Girin, “Informed source separation of underdetermined instantaneous stereo mixtures using source index embedding,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, 2010, pp. 245-248. https://hal.science/hal-00486804v1

  67. M. Parvaix, L. Girin, L. Daudet, J. Pinel and C. Baras, “Hybrid coding/indexing strategy for informed source separation of linear instantaneous under-determined audio mixtures,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  68. J. Pinel, L. Girin, C. Baras and M. Parvaix, “A high-capacity watermarking technique for audio signals based on MDCT-domain quantization,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

  69. M. Parvaix, L. Girin et J.-M. Brossier, “A watermarking-based method for single-channel audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, 2009, p. 101-104.

  70. M. Firouzmand et L. Girin, “Long-term flexible 2D cepstral modeling of speech spectral amplitudes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, p. 3937-3940.

  71. K. Hermus, L. Girin, H. Van hamme et S. Irhimeh, “Estimation of the voicing cut-off frequency contour of natural speech based on harmonic and aperiodic energies,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, p. 4473-4476.

  72. A. Aubrey, B. Rivet, Y. Hicks, L. Girin, J. Chambers et C. Jutten, “Two novel visual voice activity detectors based on appearance models and retinal filtering,” in European Signal Processing Conference (EUSIPCO), Poznan, Poland, 2007, p. 2409-2413.

  73. D. Beautemps, L. Girin, N. Aboutabit, G. Bailly, L. Besacier, G. Breton, T. Burger, A. Caplier, M.-A. Cathiard, D. Chêne et f, “TELMA: telephony for the hearing-impaired people. from models to user tests,” in International Conference on Assistive Technologies (ASSISTH), Toulouse, France, 2007, p. 201-208.

  74. L. Girin, “Long-term quantization of speech LSF parameters,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, Hawaii, USA, 2007.

  75. B. Rivet, A. J. Aubrey, L. Girin, Y. Hicks, C. Jutten et J. Chambers, “Development and comparison of two approaches for visual speech analysis with application to voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007, p. 14.

  76. B. Rivet, L. Girin, C. Serviere, D.-T. Pham et C. Jutten, “Audiovisual speech source separation: a regularization method based on visual voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007.

  77. B. Rivet, L. Girin, C. Serviere, D.-T. Pham et C. Jutten, “Using a visual voice activity detector to regularize the permutations in blind separation of convolutive speech mixtures,” in IEEE International Conference on Digital Signal Processing (DSP), Cardiff, Wales, 2007, p. 223-226.

  78. L. Girin, “Theoretical and experimental bases of a new method for accurate separation of harmonic and noise components of speech signals,” in European Signal Processing Conference (EUSIPCO), Florence, Italy, 2006, p. 1-5.

  79. D. Sodoyer, B. Rivet, L. Girin, J.-L. Schwartz et C. Jutten, “An analysis of visual speech information applied to voice activity detection,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, 2006.

  80. M. Firouzmand et L. Girin, “Perceptually weighted long term modeling of sinusoidal speech amplitude trajectories,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

  81. M. Firouzmand, L. Girin et S. Marchand, “Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech,” in Conference of the International Speech Communication Association (INTERSPEECH), Lisbon, Portugal, 2005, p. 357-360.

  82. M. Raspaud, S. Marchand et L. Girin, “A generalized polynomial and sinusoidal model for partial tracking and time stretching,” in International Conference on Digital Audio Effects (DAFx), Madrid, Spain, 2005, p. 24-29.

  83. B. Rivet, L. Girin et C. Jutten, “Solving the indeterminations of blind source separation of convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

  84. D. Beautemps, T. Burger et L. Girin, “Characterizing and classifying cued speech vowels from labial parameters,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

  85. L. Girin, M. Firouzmand et S. Marchand, “Long-term modeling of phase trajectories within the speech sinusoidal model framework,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

  86. L. Girin et S. Marchand, “Watermarking of speech signals using the sinusoidal model and frequency modulation of the partials,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Montreal, Canada, 2004.

  87. B. Rivet, L. Girin, C. Jutten et J.-L. Schwartz, “Using audiovisual speech processing to improve the robustness of the separation of convolutive speech mixtures,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Siena, Italy, 2004, p. 47-50.

  88. L. Girin, “Pure audio McGurk effect,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

  89. L. Girin, S. Marchand, J. Di Martino, A. Robel et G. Peeters, “Comparing the order of a polynomial phase model for the synthesis of quasi-harmonic audio signals,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2003, p. 193-196.

  90. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Speech extraction based on ICA and audio-visual coherence,” in IEEE International Symposium on Signal Processing and Its Applications (ISSPA), Paris, France, 2003.

  91. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Extracting an AV speech source from a mixture of signals,” in European Conference on Speech Communication and Technology (EUROSPEECH), Geneva, Switzerland, 2003.

  92. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Further experiments on audio-visual speech source separation,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

  93. D. Sodoyer, L. Girin, C. Jutten et J.-L. Schwartz, “Audio-visual speech source separation,” in International Conference on Spoken Language Processing (ICSLP), Denver, USA, 2002.

  94. L. Girin, A. Allard et J.-L. Schwartz, “Speech signals separation: a new approach exploiting the coherence of audio and visual speech,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Cannes, France, 2001, p. 631-636.

  95. E. Foucher, G. Feng et L. Girin, “A preliminary study of an audio-visual speech coder: using video parameters to reduce an LPC vocoder bit rate,” in European Signal Processing Conference (EUSIPCO), Rhodes, Greece, 1998, p. 1-4.

  96. E. Foucher, L. Girin et G. Feng, “An audiovisual speech coder using vector quantization to exploit the audio/video correlation,” in International Conference on Audio-Visual Speech Processing (AVSP), Sydney, Australia, 1998.

  97. L. Girin, G. Feng et J.-L. Schwartz, “Fusion of auditory and visual information for noisy speech enhancement: a preliminary study of vowel transitions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, USA, 1998, p. 1005-1008.

  98. L. Girin, E. Foucher et G. Feng, “An audio-visual distance for audio-visual speech vector quantization,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

  99. L. Girin, L. Varin, G. Feng et J.-L. Schwartz, “A signal processing system for having the sound pop-out in noise thanks to the image of the speaker’s lips: new advances using multi-layer perceptrons,” in International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, 1998.

  100. L. Girin, L. Varin, G. Feng et J.-L. Schwartz, “Audiovisual speech enhancement: new advances using multi-layer perceptrons,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

  101. L. Girin, G. Feng et J.-L. Schwartz, “Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions,” in European Conference on Speech Communication and Technology (EUROSPEECH), Rhodes, Greece, 1997.

  102. L. Girin, J.-L. Schwartz et G. Feng, “Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audio-visual fusion,” in International Conference on Audio-Visual Speech Processing (AVSP), Rhodes, Greece, 1997.

  103. L. Girin, G. Feng et J.-L. Schwartz, “Noisy speech enhancement with filters estimated from the speaker’s lips,” in European Conference on Speech Communication and Technology (EUROSPEECH), Madrid, Spain, 1995.