
Amélioration de l'intelligibilité de signaux audio de parole en contexte bruité automobile (English: Improving the intelligibility of speech audio signals in noisy automotive environments)

Abstract : Speech is now present in many in-car applications, ranging from hands-free communications and radio programs to synthesized speech messages from the car's various devices. However, despite steady progress in car manufacturing, significant noise remains in the car interior and reduces the intelligibility of speech signals. This PhD work aims to develop speech reinforcement tools that process the signals before they are played in a noisy in-car environment.

A highly effective speech reinforcement approach is to use a frequency equalizer to optimize an intelligibility criterion: the Speech Intelligibility Index (SII). To facilitate optimization, current methods rely on approximations of the criterion. In addition, by concentrating the spectral energy of the signal in regions where the ear is more sensitive, these methods increase the perceived loudness, which can degrade the user experience. Thus, in addition to proposing an exact method for solving the SII maximization problem, our work introduces a new perceptual constraint that keeps the signals at their perceived level, and studies its influence.

The growing popularity of machine learning approaches encourages learning speech reinforcement processing from examples naturally produced in noise (Lombard speech) or by over-articulation (clear speech). Current work fails to achieve intelligibility gains as large as those of natural modifications, and we believe that the many neglected temporal aspects may be partly responsible. Our work therefore extends these approaches by exploiting learning models and pre-processing adapted to long-duration sequences. We also propose a new model of speech rate modifications that fits directly into the machine learning model, which had not been done before.
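As a rough illustration of the equalizer-based reinforcement mentioned in the abstract, the sketch below evaluates a simplified SII-style score before and after per-band gains are applied. It is not the thesis's method. The band audibility function clip((SNR + 15) / 30, 0, 1) follows the basic form used in the SII standard, but masking, hearing thresholds and level distortion are omitted. The band levels, the importance weights and the function names are hypothetical, and a constant-power constraint stands in for the perceptual loudness constraint studied in the thesis.

```python
import numpy as np

def simplified_sii(speech_db, noise_db, importance):
    """Simplified SII: importance-weighted band audibility (assumption: no masking terms)."""
    snr = np.asarray(speech_db, float) - np.asarray(noise_db, float)
    # Map each band SNR from [-15, +15] dB linearly onto an audibility in [0, 1].
    audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
    weights = np.asarray(importance, float)
    return float(np.sum(weights / weights.sum() * audibility))

def equalize_constant_power(speech_db, gains_db):
    """Apply per-band gains, then rescale so total speech power is unchanged."""
    speech_db = np.asarray(speech_db, float)
    boosted = speech_db + np.asarray(gains_db, float)
    power_in = np.sum(10.0 ** (speech_db / 10.0))
    power_out = np.sum(10.0 ** (boosted / 10.0))
    # A single broadband offset restores the original total power.
    return boosted - 10.0 * np.log10(power_out / power_in)

# Hypothetical four-band example: car noise dominated by low frequencies.
speech_db = [60.0, 55.0, 50.0, 45.0]
noise_db = [70.0, 55.0, 45.0, 40.0]
importance = [0.1, 0.3, 0.4, 0.2]   # illustrative weights, not standard values
gains_db = [-6.0, 0.0, 6.0, 6.0]    # hand-picked gains, not an optimized solution

sii_before = simplified_sii(speech_db, noise_db, importance)
equalized_db = equalize_constant_power(speech_db, gains_db)
sii_after = simplified_sii(equalized_db, noise_db, importance)
print(f"simplified SII before: {sii_before:.3f}, after: {sii_after:.3f}")
```

Here the gains are chosen by hand to move energy away from the band where noise dominates. An actual reinforcement method would search for the gains that maximize the criterion under the chosen constraint.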
Contributor : ABES STAR
Submitted on : Monday, May 23, 2022 - 8:20:37 AM
Last modification on : Tuesday, May 24, 2022 - 3:09:54 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03675219, version 1



Enguerrand Gentet. Amélioration de l'intelligibilité de signaux audio de parole en contexte bruité automobile. Signal and Image Processing [eess.SP]. Institut Polytechnique de Paris, 2021. French. ⟨NNT : 2021IPPAT008⟩. ⟨tel-03675219⟩


