Projection d'espaces acoustiques : Une approche par apprentissage automatisé de la séparation et de la localisation de sources sonores

Abstract : In this thesis, we address the long-studied problem of binaural (two-microphone) sound source separation and localization through supervised learning. To achieve this, we develop a new paradigm referred to as acoustic space mapping, at the crossroads of binaural perception, robot hearing, audio signal processing and machine learning. The proposed approach consists of learning a link between the auditory cues perceived by the system and the position of the emitting sound source in another modality of the system, such as the visual space or the motor space. We propose new experimental protocols to automatically gather large training sets that associate such data. The resulting datasets are then used to reveal some fundamental intrinsic properties of acoustic spaces and lead to the development of a general family of probabilistic models for locally-linear high- to low-dimensional space mapping. We show that these models unify several existing regression and dimensionality reduction techniques, while encompassing a large number of new models that generalize previous ones. The properties and inference of these models are thoroughly detailed, and the advantage of the proposed methods over state-of-the-art techniques is established on different space mapping applications, beyond the scope of auditory scene analysis. We then show how the proposed methods can be probabilistically extended to tackle the long-known cocktail party problem, i.e., accurately localizing one or several sound sources emitting at the same time in a real-world environment, and separating the mixed signals. We show that the resulting techniques perform these tasks with unequaled accuracy. This demonstrates the important role of learning and puts forward the acoustic space mapping paradigm as a promising tool for robustly addressing the most challenging problems in computational binaural audition.
Document type :
Theses

Cited literature : 107 references

https://tel.archives-ouvertes.fr/tel-01134012
Contributor : Abes Star
Submitted on : Saturday, March 21, 2015 - 10:03:06 AM
Last modification on : Thursday, March 26, 2020 - 8:49:16 PM

File

DELEFORGE_Antoine_2013_archiva...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01134012, version 1

Collections

STAR | CNRS | INRIA | UGA

Citation

Antoine Deleforge. Projection d'espaces acoustiques : Une approche par apprentissage automatisé de la séparation et de la localisation de sources sonores. Other. Université de Grenoble, 2013. French. ⟨NNT : 2013GRENM033⟩. ⟨tel-01134012⟩

Metrics

Record views : 1668

File downloads : 435