Codage audio stéréo avancé

Abstract : Over the last ten years, joint coding techniques that exploit the relations and redundancies between channels have been developed to further reduce the amount of information needed to represent multichannel audio signals.
In this document, we focus on the coding of stereo audio signals for which no prior information is available on the nature of the sources present, their number, or the way they are spatialized. Such signals are the most representative of commercial music recordings and of multimedia entertainment in general. To address the problem of coding these signals, we study parametric and signal approaches, which are often combined.
In this context, three types of approaches are used. The spatial parametric approach reduces the number of audio channels of the signal to encode, and recreates the original number of channels from the reduced channels and from spatial parameters extracted from the original channels. The signal approach keeps the original number of channels but encodes mono signals that are built by combining the original channels and that contain less redundancy. Finally, the hybrid approach introduced in the MPEG USAC standard keeps two channels for a stereo signal, but one is a mono downmix and the other is a residual signal resulting from a prediction based on the downmix, with the prediction parameters encoded as side information.
In this document, we first analyse the characteristics of a stereo audio signal from a commercial recording and the associated production techniques. This study leads us to consider the relations between the emitter parametric models, derived from our analysis of commercial recording production techniques, and the receiver models that form the basis of spatial parametric coding. In the light of these considerations, we present and study the three approaches mentioned above.
For the parametric approach, we show that transparency cannot be achieved for most stereo audio signals, we reflect on parametric representations, and we propose techniques to improve audio quality and further reduce the bitrate of the parameters. These improvements are obtained by segmenting the signal more effectively, based on significant transients, by exploiting perceptual characteristics of some spatial cues, and by adapting the estimation of the spatial cues. As the hybrid approach has recently been standardized in MPEG USAC, we provide a full review of it, then develop a new coding technique to optimize the allocation of residual bands when the residual is not used over the whole bandwidth of the signal to encode. In the conclusion, we discuss the future of spatial audio coding in general and show the importance of developing new segmentation and classification techniques for audio signals, in order to further adapt the coding to the content of the signal.

Cited literature [41 references]
Contributor : Abes Star
Submitted on : Friday, July 7, 2017 - 2:47:22 PM
Last modification on : Thursday, July 4, 2019 - 11:00:07 AM
Long-term archiving on : Thursday, December 14, 2017 - 2:46:34 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01401268, version 1


Julien Capobianco. Codage audio stéréo avancé. Signal and Image Processing [eess.SP]. Université Pierre et Marie Curie - Paris VI, 2015. French. ⟨NNT : 2015PA066712⟩. ⟨tel-01401268⟩


