GIRIN Laurent
Professor, Grenoble INP
Publications


Papers in peer-reviewed international journals


[J36]. X. Li, S. Leglaive, L. Girin, and R. Horaud, “Audio noise power spectral density estimation using long short-term memory,” IEEE Signal Processing Letters, vol. 26, no. 6, pp. 918–922, 2019.

[J35]. X. Li, L. Girin, S. Gannot, and R. Horaud, “Multichannel online dereverberation based on spectral magnitude inverse filtering,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1365–1377, 2019.

[J34]. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda, and R. Horaud, “Online localization and tracking of multiple moving speakers in reverberant environments,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 1, pp. 88–103, 2019.

[J33]. X. Li, L. Girin, and R. Horaud, “Expectation-Maximization for speech source separation using the convolutive transfer function,” CAAI Transactions on Intelligence Technology, vol. 4, no. 1, pp. 47–53, 2019.

[J32]. X. Li, L. Girin, S. Gannot, and R. Horaud, “Multichannel speech separation and enhancement using the convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 3, pp. 645–659, 2019.

[J31]. P. Laffitte, Y. Wang, D. Sodoyer, and L. Girin, “Assessing the performances of different neural networks architectures for the detection of screams and shouts in public transportation,” Expert Systems With Applications, vol. 117, pp. 29–41, 2019.

[J30]. X. Li, S. Gannot, L. Girin, and R. Horaud, “Multichannel identification and nonnegative equalization for dereverberation and noise reduction based on the convolutive transfer function,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 10, pp. 1755–1768, 2018.

[J29]. F. Bocquelet, T. Hueber, L. Girin, S. Chabardès, and B. Yvert, “Key considerations in designing a speech brain-computer interface,” Journal of Physiology - Paris, vol. 110, no. 4, pp. 392–401, 2017.

[J28]. D. Fabre, T. Hueber, L. Girin, X. Alameda-Pineda, and P. Badin, “Automatic animation of an articulatory tongue model from ultrasound images of the vocal tract,” Speech Communication, vol. 93, no. 9, pp. 63–75, 2017.

[J27]. X. Li, L. Girin, R. Horaud, and S. Gannot, “Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1997–2012, 2017.

[J26]. L. Girin, T. Hueber, and X. Alameda-Pineda, “Extending the cascaded Gaussian mixture regression framework for cross-speaker acoustic-articulatory mapping,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 3, pp. 662–673, 2017.

[J25]. F. Ben Ali, S. Djaziri-Larbi, and L. Girin, “A low bit-rate speech codec based on a long-term harmonic plus noise model,” Journal of the Audio Engineering Society, vol. 64, no. 11, pp. 1–14, 2016.

[J24]. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux, and B. Yvert, “Real-time control of an articulatory-based speech synthesizer for brain-computer interfaces,” PLOS Computational Biology, vol. 12, no. 11, 28 p., 2016.

[J23]. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud, “A variational EM algorithm for the separation of time-varying convolutive audio mixtures,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 8, pp. 1408–1423, 2016.

[J22]. X. Li, L. Girin, R. Horaud, and S. Gannot, “Estimation of the direct-path relative transfer function for supervised sound source localization,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 11, pp. 2171–2186, 2016.

[J21]. A. Deleforge, R. Horaud, Y. Y. Schechner, and L. Girin, “Co-localization of audio sources in images using binaural features and locally-linear regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 4, pp. 718–731, 2015.

[J20]. T. Hueber, L. Girin, X. Alameda-Pineda, and G. Bailly, “Speaker-adaptive acoustic-articulatory inversion using cascaded Gaussian mixture regression,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2246–2259, 2015.

[J19]. J. Pinel, L. Girin, and C. Baras, “A high-rate data hiding technique for uncompressed audio signals,” Journal of the Audio Engineering Society, vol. 62, no. 6, pp. 400–413, 2014.

[J18]. K. Grabski, P. Tremblay, V. Gracco, L. Girin, and M. Sato, “A mediating role of the auditory dorsal pathway in selective adaptation to speech: a state-dependent transcranial magnetic stimulation study,” Brain Research, vol. 1515, no. 5, pp. 55–65, 2013.

[J17]. S. Zhang and L. Girin, “Fast and accurate direct MDCT-to-DFT conversion with arbitrary window functions,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 3, pp. 567–578, 2013.

[J16]. A. Liutkus, J. Pinel, R. Badeau, L. Girin, and G. Richard, “Informed source separation through spectrogram coding and data embedding,” Signal Processing, vol. 92, no. 8, pp. 1937–1949, 2012.

[J15]. S. Marchand, B. Mansencal, and L. Girin, “Interactive music with active audio CDs,” Lecture Notes in Computer Science, vol. 6684, pp. 31–50, 2011, (extended version of selected paper from CMMR 2010).

[J14]. M. Parvaix and L. Girin, “Informed source separation of linear instantaneous under-determined audio mixtures by source index embedding,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 6, pp. 1721–1733, 2011.

[J13]. L. Girin, “Adaptive long-term coding of LSF parameters trajectories for large-delay/very-to ultra-low bit-rate speech coding,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2010, Article ID 597039, 2010.

[J12]. M. Parvaix, L. Girin, and J.-M. Brossier, “A watermarking-based method for informed source separation of audio signals with a single sensor,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, pp. 1464–1475, 2010.

[J11]. D. Sodoyer, B. Rivet, L. Girin, C. Savariaux, J.-L. Schwartz, and C. Jutten, “A study of lip movements during spontaneous dialog and its application to voice activity detection,” Journal of the Acoustical Society of America, vol. 125, no. 2, pp. 1184–1196, 2009.

[J10]. L. Girin, M. Firouzmand, and S. Marchand, “Perceptual long-term variable-rate sinusoidal modeling of speech,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 851–861, 2007.

[J9]. B. Rivet, L. Girin, and C. Jutten, “Log-Rayleigh distribution: a simple and efficient statistical representation of log-spectral coefficients,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, pp. 796–802, 2007.

[J8]. B. Rivet, L. Girin, and C. Jutten, “Mixing audiovisual speech processing and blind source separation for the extraction of speech signals from convolutive mixtures,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 96–108, 2007.

[J7]. B. Rivet, L. Girin, and C. Jutten, “Visual voice activity detection as a help for speech source separation from convolutive mixtures,” Speech Communication, vol. 49, no. 7, pp. 667–677, 2007.

[J6]. G. Bailly, V. Attina, C. Baras, P. Bas, S. Baudry, D. Beautemps, R. Brun, J.-M. Chassery, F. Davoine, F. Elisei, G. Gibert, L. Girin, et al., “ARTUS: Synthesis and audiovisual watermarking of the movements of a virtual agent interpreting subtitles using cued speech for deaf televiewers,” AMSE Modelling, Measurement and Control-C, vol. 67, no. 2, pp. 177–187, 2006.

[J5]. L. Girin, “Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding,” IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 265–276, 2004.

[J4]. D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, “Developing an audio-visual speech source separation algorithm,” Speech Communication, vol. 44, no. 1, pp. 113–125, 2004.

[J3]. D. Sodoyer, J.-L. Schwartz, L. Girin, J. Klinkisch, and C. Jutten, “Separation of audio-visual speech sources: a new approach exploiting the audio-visual coherence of speech stimuli,” EURASIP Journal on Applied Signal Processing, vol. 2002, no. 11, pp. 1165–1173, 2002.

[J2]. L. Girin, J.-L. Schwartz, and G. Feng, “Audio-visual enhancement of speech in noise,” Journal of the Acoustical Society of America, vol. 109, no. 6, pp. 3007–3020, 2001.

[J1]. L. Girin, G. Feng, and J.-L. Schwartz, “Débruitage de parole par un filtrage utilisant l’image du locuteur : une étude de faisabilité,” Traitement du Signal, vol. 13, no. 4, pp. 319–334, 1996.

Book chapters

[Ch3]. L. Girin, X. Li, and S. Gannot, “Audio source separation into the wild,” in Multimodal Behavior Analysis in the Wild, X. Alameda-Pineda, E. Ricci, N. Sebe, Eds., Elsevier Academic Press, pp. 58–78, 2018.

[Ch2]. G. Feng and L. Girin, “Principles of speech coding,” in Spoken Language Processing, J. Mariani, Ed., ISTE Ltd / John Wiley and Sons, London, 2009.

[Ch1]. G. Feng and L. Girin, “Principes du codage de la parole,” in Traitement automatique du langage parlé, Tome 1 : Analyse, synthèse et codage de la parole, J. Mariani, Ed., Hermès-Lavoisier, Paris, 2002.


Peer-reviewed international conferences

[C88]. L. Girin, F. Roche, S. Leglaive, and T. Hueber, “Notes on the use of variational autoencoders for speech and audio spectrogram modeling,” in International Conference on Digital Audio Effects (DAFx), Birmingham, UK, 2019.

[C87]. F. Roche, T. Hueber, S. Limier, and L. Girin, “Autoencoders for music sound encoding: A comparison of linear, shallow, deep, recurrent and variational models,” in Sound and Music Computing Conference (SMC), Malaga, Spain, 2019.

[C86]. R. Frisch, M. Faix, E. Mazer, J. Droulez, and L. Girin, “Bayesian time-domain multiple sound source localization for a stochastic machine,” in European Signal Processing Conference (EUSIPCO), A Coruna, Spain, 2019.

[C85]. S. Leglaive, L. Girin, and R. Horaud, “Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019.

[C84]. S. Leglaive, U. Simsekli, A. Liutkus, L. Girin, and R. Horaud, “Speech enhancement with variational autoencoders and alpha-stable distributions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019.

[C83]. X. Li, Y. Ban, L. Girin, X. Alameda-Pineda, and R. Horaud, “A cascaded multiple-speaker localization and tracking system,” in IEEE International Workshop on Acoustic Signal Enhancement (IWAENC) - LOCATA Challenge Workshop, Tokyo, Japan, 2018.

[C82]. Q.V. Nguyen, L. Girin, G. Bailly, F. Elisei, and D.C. Nguyen, “Autonomous sensorimotor learning for sound source localization by a humanoid robot,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) - Workshop on Crossmodal Learning for Intelligent Robotics, Madrid, Spain, 2018.

[C81]. S. Leglaive, L. Girin, and R. Horaud, “A variance modeling framework based on variational auto-encoders for speech enhancement,” in IEEE Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 2018.

[C80]. X. Li, B. Mourgue, L. Girin, S. Gannot, and R. Horaud, “Online localization of multiple moving speakers in reverberant environments,” in IEEE Workshop on Sensor Array and Multichannel Signal Processing (SAM), Sheffield, UK, 2018.

[C79]. X. Li, S. Gannot, L. Girin, and R. Horaud, “Multisource MINT using the convolutive transfer function,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018.

[C78]. Y. Ban, X. Li, X. Alameda-Pineda, L. Girin, and R. Horaud, “Accounting for room acoustics in audio-visual multi-speaker tracking,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018.

[C77]. R. Frisch, R. Laurent, M. Faix, L. Girin, L. Fesquet, A. Lux, J. Droulez, P. Bessière, and E. Mazer, “A Bayesian stochastic machine for sound source localization,” in IEEE International Conference on Rebooting Computing, Washington, DC, USA, 2017.

[C76]. Y. Ban, L. Girin, X. Alameda-Pineda, and R. Horaud, “Exploiting the complementarity of audio and visual data in multi-speaker tracking,” in International Conference on Computer Vision - Workshop on Computer Vision for Audio-Visual Media, Venice, Italy, 2017.

[C75]. M. Fontaine, A. Liutkus, L. Girin, and R. Badeau, “Explaining the parameterized Wiener filter with Alpha-stable processes,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017.

[C74]. X. Li, L. Girin, and R. Horaud, “An EM algorithm for audio source separation based on the convolutive transfer function,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017.

[C73]. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, R. Horaud, and S. Gannot, “Exploiting the intermittency of speech for joint separation and diarization,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, 2017.

[C72]. L. Girin and R. Badeau, “On the use of latent mixing filters in audio source separation,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017.

[C71]. L. Girin, T. Hueber, and X. Alameda-Pineda, “Adaptation of a Gaussian mixture regressor to a new input distribution: Extending the C-GMR framework,” in International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017.

[C70]. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud, “An EM algorithm for joint source separation and diarization of multichannel convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017.

[C69]. X. Li, L. Girin, and R. Horaud, “Audio source separation based on convolutive transfer function and frequency-domain Lasso optimization,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, Louisiana, USA, 2017.

[C68]. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud, “An inverse-Gamma source variance prior with factorized parameterization for audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 136–140.

[C67]. P. Laffitte, D. Sodoyer, C. Tatkeu, and L. Girin, “Deep neural networks for automatic detection of screams and shouted speech in subway trains,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 6460–6464.

[C66]. X. Li, L. Girin, F. Badeig, and R. Horaud, “Reverberant sound localization with a robot head based on direct-path relative transfer function,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, South Korea, 2016.

[C65]. X. Li, L. Girin, S. Gannot, and R. Horaud, “Non-stationary noise power spectral density estimation based on regional statistics,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 181–185.

[C64]. X. Li, R. Horaud, L. Girin, and S. Gannot, “Voice activity detection based on statistical likelihood ratio with adaptive thresholding,” in IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), Xi'an, China, 2016.

[C63]. F. Bocquelet, T. Hueber, L. Girin, C. Savariaux, and B. Yvert, “Real-time control of a DNN-based articulatory synthesizer for silent speech conversion: a pilot study,” in Conference of the International Speech Communication Association (INTERSPEECH), Dresden, Germany, 2015, pp. 2405–2409.

[C62]. D. Kounades-Bastian, L. Girin, X. Alameda-Pineda, S. Gannot, and R. Horaud, “A variational EM algorithm for the separation of moving sound sources,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NJ, USA, Best Student Paper Award, 2015, pp. 1–5.

[C61]. X. Li, L. Girin, R. Horaud, and S. Gannot, “Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brisbane, Australia, 2015, pp. 320–324.

[C60]. X. Li, R. Horaud, L. Girin, and S. Gannot, “Local relative transfer function for sound source localization,” in European Signal Processing Conference (EUSIPCO), Nice, France, 2015, pp. 399–403.

[C59]. F. Bocquelet, T. Hueber, L. Girin, P. Badin, and B. Yvert, “Robust articulatory speech synthesis using deep neural networks for BCI applications,” in Conference of the International Speech Communication Association (INTERSPEECH), Singapore, 2014.

[C58]. A. Deleforge, V. Drouard, L. Girin, and R. Horaud, “Mapping sounds onto images using binaural spectrograms,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 2470–2474.

[C57]. M. Janvier, X. Alameda-Pineda, L. Girin, and R. Horaud, “Sound representation and classification benchmark for domestic robots,” in IEEE International Conference on Robotics and Automation (ICRA), Hong-Kong, China, 2014, pp. 6285–6292.

[C56]. S. Kırbız, A. Ozerov, A. Liutkus, and L. Girin, “Perceptual coding-based informed source separation,” in European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014, pp. 959–963.

[C55]. M. Janvier, R. Horaud, L. Girin, F. Berthommier, L.-J. Boë, C. Kemp, A. Rey, and T. Legou, “Supervised classification of baboon vocalizations,” in Neural Information Processing Scaled for Bioacoustics (NIPS4B), Lake Tahoe, Nevada, USA, 2013.

[C54]. S. Zhang, L. Girin, and A. Liutkus, “Informed source separation from compressed mixtures using spatial Wiener filter and quantization noise estimation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, 2013, pp. 61–65.

[C53]. F. Berthommier, L. Girin, and L.-J. Boë, “A simple hybrid acoustic and morphologically constrained technique for the synthesis of stop consonants in various vocalic contexts,” in Conference of the International Speech Communication Association (INTERSPEECH), Portland, USA, 2012.

[C52]. T. Gerber, M. Dutasta, L. Girin, and C. Févotte, “Professionally-produced music separation guided by covers,” in International Society for Music Information Retrieval Conference (ISMIR), Porto, Portugal, 2012.

[C51]. M. Janvier, X. Alameda-Pineda, L. Girin, and R. Horaud, “Sound-event recognition with a companion humanoid,” in IEEE/RAS International Conference on Humanoid Robots (Humanoids), Osaka, Japan, 2012.

[C50]. A. Liutkus, S. Gorlow, N. Sturmel, S. Zhang, L. Girin, R. Badeau, L. Daudet, S. Marchand, and G. Richard, “Informed audio source separation: a comparative study,” in European Signal Processing Conference (EUSIPCO), Bucharest, Romania, 2012, pp. 2397–2401.

[C49]. S. Marchand, R. Badeau, C. Baras, L. Daudet, D. Fourer, L. Girin, S. Gorlow, A. Liutkus, J. Pinel, G. Richard, et al., “DReaM: A novel system for joint source separation and multitrack coding,” in Audio Engineering Society (AES) Convention, San Francisco, USA, 2012.

[C48]. N. Sturmel, L. Daudet, and L. Girin, “Phase-based informed source separation for active listening of music,” in International Conference on Digital Audio Effects (DAFx), York, UK, 2012.

[C47]. N. Sturmel, A. Liutkus, J. Pinel, L. Girin, S. Marchand, G. Richard, R. Badeau, and L. Daudet, “Linear mixing models for active listening of music productions in realistic studio conditions,” in Audio Engineering Society (AES) Convention, Budapest, Hungary, Best Paper Award, 2012.

[C46]. F. Ben Ali, L. Girin, and S. Djaziri-Larbi, “A long-term harmonic plus noise model for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 53–56.

[C45]. L. Girin and J. Pinel, “Informed audio source separation from compressed linear stereo mixtures,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011.

[C44]. J. Pinel and L. Girin, “A high-rate data hiding technique for audio signals based on IntMDCT quantization,” in International Conference on Digital Audio Effects (DAFx), Paris, France, 2011, pp. 353–356.

[C43]. J. Pinel and L. Girin, “Sparsification of audio signals using the MDCT/IntMDCT and a psychoacoustic model. Application to informed audio source separation,” in Audio Engineering Society (AES) Conference, Ilmenau, Germany, 2011.

[C42]. S. Zhang and L. Girin, “An informed source separation system for speech signals,” in Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, 2011, pp. 573–576.

[C41]. F. Ben Ali, L. Girin, and S. Djaziri-Larbi, “Long-term modelling of parameters trajectories for the harmonic plus noise model of speech signals,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

[C40]. S. Marchand, B. Mansencal, and L. Girin, “Interactive music with active audio CDs,” in International Symposium on Computer Music Modeling and Retrieval (CMMR), Malaga, Spain, 2010.

[C39]. M. Mazuel, D. David, and L. Girin, “Linking motion sensors and digital signal processing for real-time musical transformations,” in International Conference on Haptic Audio Interaction Design (HAID) (demo session), Copenhagen, Denmark, 2010.

[C38]. M. Parvaix and L. Girin, “Informed source separation of underdetermined instantaneous stereo mixtures using source index embedding,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, 2010, pp. 245–248.

[C37]. M. Parvaix, L. Girin, L. Daudet, J. Pinel, and C. Baras, “Hybrid coding/indexing strategy for informed source separation of linear instantaneous under-determined audio mixtures,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

[C36]. J. Pinel, L. Girin, C. Baras, and M. Parvaix, “A high-capacity watermarking technique for audio signals based on MDCT-domain quantization,” in International Congress on Acoustics (ICA), Sydney, Australia, 2010.

[C35]. M. Parvaix, L. Girin, and J.-M. Brossier, “A watermarking-based method for single-channel audio source separation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, 2009, pp. 101–104.

[C34]. M. Firouzmand and L. Girin, “Long-term flexible 2D cepstral modeling of speech spectral amplitudes,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, pp. 3937–3940.

[C33]. K. Hermus, L. Girin, H. Van hamme, and S. Irhimeh, “Estimation of the voicing cut-off frequency contour of natural speech based on harmonic and aperiodic energies,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, USA, 2008, pp. 4473–4476.

[C32]. A. Aubrey, B. Rivet, Y. Hicks, L. Girin, J. Chambers, and C. Jutten, “Two novel visual voice activity detectors based on appearance models and retinal filtering,” in European Signal Processing Conference (EUSIPCO), Poznan, Poland, 2007, pp. 2409–2413.

[C31]. D. Beautemps, L. Girin, N. Aboutabit, G. Bailly, L. Besacier, G. Breton, T. Burger, A. Caplier, M.-A. Cathiard, D. Chêne, et al., “TELMA: Telephony for the hearing-impaired people. From models to user tests,” in International Conference on Assistive Technologies (ASSISTH), Toulouse, France, 2007, pp. 201–208.

[C30]. L. Girin, “Long-term quantization of speech LSF parameters,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, Hawaii, USA, 2007.

[C29]. B. Rivet, A. J. Aubrey, L. Girin, Y. Hicks, C. Jutten, and J. Chambers, “Development and comparison of two approaches for visual speech analysis with application to voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007, p. 14.

[C28]. B. Rivet, L. Girin, C. Serviere, D.-T. Pham, and C. Jutten, “Audiovisual speech source separation: a regularization method based on visual voice activity detection,” in International Conference on Audio-Visual Speech Processing (AVSP), Hilvarenbeek, The Netherlands, 2007.

[C27]. B. Rivet, L. Girin, C. Serviere, D.-T. Pham, and C. Jutten, “Using a visual voice activity detector to regularize the permutations in blind separation of convolutive speech mixtures,” in IEEE International Conference on Digital Signal Processing (DSP), Cardiff, Wales, 2007, pp. 223–226.

[C26]. L. Girin, “Theoretical and experimental bases of a new method for accurate separation of harmonic and noise components of speech signals,” in European Signal Processing Conference (EUSIPCO), Florence, Italy, 2006, pp. 1–5.

[C25]. D. Sodoyer, B. Rivet, L. Girin, J.-L. Schwartz, and C. Jutten, “An analysis of visual speech information applied to voice activity detection,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, 2006.

[C24]. M. Firouzmand and L. Girin, “Perceptually weighted long term modeling of sinusoidal speech amplitude trajectories,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

[C23]. M. Firouzmand, L. Girin, and S. Marchand, “Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech,” in Conference of the International Speech Communication Association (INTERSPEECH), Lisbon, Portugal, 2005, pp. 357–360.

[C22]. M. Raspaud, S. Marchand, and L. Girin, “A generalized polynomial and sinusoidal model for partial tracking and time stretching,” in International Conference on Digital Audio Effects (DAFx), Madrid, Spain, 2005, pp. 24–29.

[C21]. B. Rivet, L. Girin, and C. Jutten, “Solving the indeterminations of blind source separation of convolutive speech mixtures,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, USA, 2005.

[C20]. D. Beautemps, T. Burger, and L. Girin, “Characterizing and classifying cued speech vowels from labial parameters,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

[C19]. L. Girin, M. Firouzmand, and S. Marchand, “Long term modeling of phase trajectories within the speech sinusoidal model framework,” in International Conference on Spoken Language Processing (ICSLP), Jeju, South Korea, 2004.

[C18]. L. Girin and S. Marchand, “Watermarking of speech signals using the sinusoidal model and frequency modulation of the partials,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Montreal, Canada, 2004.

[C17]. B. Rivet, L. Girin, C. Jutten, and J.-L. Schwartz, “Using audiovisual speech processing to improve the robustness of the separation of convolutive speech mixtures,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Siena, Italy, 2004, pp. 47–50.

[C16]. L. Girin, “Pure audio McGurk effect,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

[C15]. L. Girin, S. Marchand, J. Di Martino, A. Robel, and G. Peeters, “Comparing the order of a polynomial phase model for the synthesis of quasi-harmonic audio signals,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2003, pp. 193–196.

[C14]. D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, “Extracting an AV speech source from a mixture of signals,” in European Conference on Speech Communication and Technology (EUROSPEECH), Geneva, Switzerland, 2003.

[C13]. D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, “Further experiments on audio-visual speech source separation,” in International Conference on Audio-Visual Speech Processing (AVSP), Saint-Jorioz, France, 2003.

[C12]. D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, “Speech extraction based on ICA and audio-visual coherence,” in IEEE International Symposium on Signal Processing and its Applications (ISSPA), Paris, France, 2003.

[C11]. D. Sodoyer, L. Girin, C. Jutten, and J.-L. Schwartz, “Audio-visual speech source separation,” in International Conference on Spoken Language Processing (ICSLP), Denver, USA, 2002.

[C10]. L. Girin, A. Allard, and J.-L. Schwartz, “Speech signals separation: a new approach exploiting the coherence of audio and visual speech,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Cannes, France, 2001, pp. 631–636.

[C9]. E. Foucher, G. Feng, and L. Girin, “A preliminary study of an audio-visual speech coder: using video parameters to reduce an LPC vocoder bit rate,” in European Signal Processing Conference (EUSIPCO), Rhodes, Greece, 1998, pp. 1–4.

[C8]. E. Foucher, L. Girin, and G. Feng, “An audiovisual speech coder using vector quantization to exploit the audio/video correlation,” in International Conference on Audio-Visual Speech Processing (AVSP), Sydney, Australia, 1998.

[C7]. L. Girin, E. Foucher, and G. Feng, “An audio-visual distance for audio-visual speech vector quantization,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

[C6]. L. Girin, G. Feng, and J.-L. Schwartz, “Fusion of auditory and visual information for noisy speech enhancement: a preliminary study of vowel transitions,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seattle, USA, 1998, pp. 1005–1008.

[C5]. L. Girin, L. Varin, G. Feng, and J.-L. Schwartz, “A signal processing system for having the sound pop-out in noise thanks to the image of the speaker’s lips: new advances using multi-layer perceptrons,” in International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, 1998.

[C4]. L. Girin, L. Varin, G. Feng, and J.-L. Schwartz, “Audiovisual speech enhancement: new advances using multi-layer perceptrons,” in IEEE Workshop on Multimedia Signal Processing (MMSP), Los Angeles, USA, 1998.

[C3]. L. Girin, G. Feng, and J.-L. Schwartz, “Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions,” in European Conference on Speech Communication and Technology (EUROSPEECH), Rhodes, Greece, 1997.

[C2]. L. Girin, J.-L. Schwartz, and G. Feng, “Can the visual input make the audio signal pop out in noise? A first study of the enhancement of noisy VCV acoustic sequences by audio-visual fusion,” in International Conference on Audio-Visual Speech Processing (AVSP), Rhodes, Greece, 1997.

[C1]. L. Girin, G. Feng, and J.-L. Schwartz, “Noisy speech enhancement with filters estimated from the speaker’s lips,” in European Conference on Speech Communication and Technology (EUROSPEECH), Madrid, Spain, 1995.

Patents

[B3]. N. Sturmel, L. Daudet, and L. Girin, “Procédé de traitement numérique sur un ensemble de pistes audio avant mixage,” French patent application no. 11/61635, filed 14 December 2011 in the names of CNRS, Institut Polytechnique de Grenoble, and Université Paris Diderot; published 21 June 2013 (FR 2984579) and extended internationally (WO 2013087638). Ref. hal-01021287.

[B2]. A. Liutkus, L. Girin, G. Richard, and R. Badeau, “Procédé et dispositif de formation d’un signal mixé numérique audio, procédé et dispositif de séparation de signaux, et signal correspondant,” French patent application no. 10/58348, filed 11 October 2010 in the names of Institut Polytechnique de Grenoble and Télécom ParisTech; published 20 April 2012 (FR 2966277) and extended internationally (EP 2628154, WO 2012049176). Ref. hal-00945254.

[B1]. M. Parvaix, L. Girin, J.-M. Brossier, and S. Marchand, “Procédé et dispositif de formation d’un signal mixé, procédé et dispositif de séparation de signaux, et signal correspondant,” French patent application no. 09/52397, filed 10 April 2009 in the names of Institut Polytechnique de Grenoble and Université de Bordeaux 1; published 15 October 2010 (FR 2944403) and extended internationally (EP 2417597, WO 2010116068). Ref. hal-01021265.



Grenoble Images Parole Signal Automatique laboratory

UMR 5216 CNRS - Grenoble INP - Université Joseph Fourier - Université Stendhal