Analyse de scène sonore multi-capteurs : un front-end temps-réel pour la manipulation de scène (Multi-sensor sound scene analysis: a real-time front-end for scene manipulation)

Abstract : The context of this thesis is the development of spatialized audio (5.1 content, Dolby Atmos...) and particularly of 3D audio. Among the existing 3D audio formats, Ambisonics and Higher Order Ambisonics (HOA) provide a homogeneous spatial representation of a sound field and allow basic manipulations, such as rotations or distortions. The aim of the thesis is to provide efficient tools for the analysis and manipulation of Ambisonics and HOA sound scenes. Real-time operation and robustness to reverberation are the main constraints. The implemented algorithm is based on a frame-by-frame Independent Component Analysis (ICA), which decomposes the sound field into a set of acoustic contributions. A Bayesian classification step is then applied to the extracted components to separate the real sources from the residual reverberation. The directions of arrival of the sources are extracted from the mixing matrix estimated by ICA, according to the ambisonic formalism, and a real-time map of the sound scene is obtained. Performance has been evaluated in different acoustic environments to assess the influence of several parameters, such as the ambisonic order, the frame length and the number of sources. Accurate results in terms of source localization and source counting have been obtained for frame lengths of a few hundred milliseconds. The algorithm is used as a pre-processing step for a speech recognition prototype and significantly improves the recognition results in far-field conditions and in the presence of noise and interfering sources.
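
Illustration : the abstract states that directions of arrival are read from the mixing matrix estimated by ICA, according to the ambisonic formalism. The sketch below shows one way this step could look for first-order Ambisonics only. It is not the thesis implementation. The channel order (W, X, Y, Z), the unit gain on W, the plane-wave model and the function names `encode_first_order` and `doa_from_mixing_column` are all assumptions made for this example.

```python
import math


def encode_first_order(azimuth, elevation):
    """Plane-wave gains for channels (W, X, Y, Z).

    Assumed convention: unit gain on the omnidirectional channel W,
    angles in radians. The thesis may use a different normalization.
    """
    return [
        1.0,
        math.cos(azimuth) * math.cos(elevation),
        math.sin(azimuth) * math.cos(elevation),
        math.sin(elevation),
    ]


def doa_from_mixing_column(column):
    """Recover (azimuth, elevation) in radians from one mixing-matrix column."""
    w, x, y, z = column
    if w == 0.0:
        raise ValueError("zero omnidirectional gain: not a plane-wave column")
    # ICA leaves the scale and sign of each column undetermined;
    # dividing by the W gain removes both ambiguities.
    x, y, z = x / w, y / w, z / w
    azimuth = math.atan2(y, x)
    elevation = math.atan2(z, math.hypot(x, y))
    return azimuth, elevation


# A column as ICA might return it: true direction, arbitrary scale and sign.
true_az, true_el = math.radians(40.0), math.radians(-15.0)
column = [-2.5 * g for g in encode_first_order(true_az, true_el)]
est_az, est_el = doa_from_mixing_column(column)
```

In this idealized, noise-free case the estimated angles match the encoded direction. With reverberation the columns only approximate the plane-wave model, which is where the classification step described in the abstract comes in.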
Cited literature [86 references]

https://tel.archives-ouvertes.fr/tel-01792433
Contributor : Abes Star
Submitted on : Tuesday, May 15, 2018 - 2:39:05 PM
Last modification on : Tuesday, March 31, 2020 - 3:22:44 PM
Long-term archiving on : Monday, September 24, 2018 - 2:01:22 PM

File

2017LEMA1013.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01792433, version 1

Citation

Mathieu Baque. Analyse de scène sonore multi-capteurs : un front-end temps-réel pour la manipulation de scène. Acoustique [physics.class-ph]. Université du Maine, 2017. Français. ⟨NNT : 2017LEMA1013⟩. ⟨tel-01792433⟩

Metrics

Record views

200

Files downloads

354