
New contributions to audio source separation and diarisation of Multichannel Convolutive Mixtures

Abstract : In this thesis we address the problem of audio source separation (ASS) for multichannel, underdetermined convolutive mixtures through probabilistic modeling. We focus on three aspects of the problem and make three contributions. First, inspired by the empirically well-validated representation of an audio signal known as the local Gaussian signal model (LGM) with non-negative matrix factorization (NMF), we propose a Bayesian extension of this model that overcomes some of the limitations of NMF. We incorporate this representation into a multichannel ASS framework and compare it with the state of the art in ASS, obtaining promising results. Second, we study how to separate mixtures of moving sources and/or moving microphones. Movement makes the acoustic path between sources and microphones time-varying. Time-varying audio mixtures have received little attention in the ASS literature. We therefore start from a state-of-the-art LGM-with-NMF method designed for separating time-invariant audio mixtures and propose an extension that uses a Kalman smoother to track the acoustic path over time. The proposed method is benchmarked against a block-wise adaptation of that state-of-the-art method (run on time segments) and delivers competitive results on both simulated and real-world mixtures. Lastly, we investigate the link between ASS and the task of audio diarisation. Audio diarisation is the recognition of the time intervals during which each speaker/source in the mixture is active. Most state-of-the-art ASS methods assume that the sources emit continuously, a hypothesis that can result in spurious signal estimates for a source in intervals where that source was not emitting. Our aim is for diarisation to aid ASS by indicating the emitting sources at each time frame. To that end, we design a joint framework for simultaneous diarisation and ASS that incorporates a hidden Markov model (HMM) to track the temporal activity of the sources within a state-of-the-art LGM-with-NMF ASS framework. We compare the proposed method with the state of the art in both ASS and audio diarisation. The proposed method performs comparably to the state of the art in terms of separation and outperforms it in terms of diarisation.

Cited literature [49 references]
Contributor : Abes Star
Submitted on : Friday, January 12, 2018 - 3:06:47 PM
Last modification on : Wednesday, November 4, 2020 - 3:15:46 PM
Long-term archiving on : Wednesday, May 23, 2018 - 8:03:26 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01681361, version 2



Dionyssos Kounadis-Bastian. New contributions to audio source separation and diarisation of Multichannel Convolutive Mixtures. Sound [cs.SD]. Université Grenoble Alpes, 2017. English. ⟨NNT : 2017GREAM012⟩. ⟨tel-01681361v2⟩


