
Neuro-steered music source separation

Abstract : In this PhD thesis, we address the challenge of integrating Brain-Computer Interfaces (BCI) and music technologies, focusing on the specific application of music source separation, which is the task of isolating the individual sound sources that are mixed in the audio recording of a musical piece. This problem has been investigated for decades, but BCI has never been considered as a possible way to guide and inform separation systems. Specifically, we explored how the neural activity captured by electroencephalographic (EEG) signals reflects information about the attended instrument, and how this information can be used to inform a source separation system.

First, we studied the problem of EEG-based auditory attention decoding of a target instrument in polyphonic music, showing that the EEG tracks musically relevant features which are highly correlated with the time-frequency representation of the attended source and only weakly correlated with the unattended one. Second, we leveraged this "contrast" to inform an unsupervised source separation model based on a novel non-negative matrix factorisation (NMF) variant, named contrastive-NMF (C-NMF), and to automatically separate the attended source.

Unsupervised NMF is a powerful approach for applications with little or no training data, as is the case when neural recordings are involved. Indeed, music-related EEG datasets are still costly and time-consuming to acquire, which precludes tackling the problem with fully supervised deep learning approaches. Thus, in the last part of the thesis, we explored alternative learning strategies to alleviate this problem. Specifically, we proposed to adapt a state-of-the-art music source separation model to a specific mixture using the time activations of the sources derived from the user's neural activity. This paradigm can be referred to as one-shot adaptation, as it acts on the target song instance only.

We conducted an extensive evaluation of both proposed systems on the MAD-EEG dataset, which was specifically assembled for this study, and obtained encouraging results, especially in difficult cases where non-informed models struggle.
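To make the NMF-based idea more concrete, here is a minimal sketch in Python. It is not the thesis's C-NMF: it runs plain NMF (Euclidean cost, multiplicative updates) on a synthetic magnitude spectrogram and only afterwards selects the component whose time activation correlates best with a stand-in for an EEG-derived feature. All names (`nmf`, `neural_envelope`, `W_true`, `H_true`) and all data are hypothetical, and the "neural" signal is simply a noisy copy of one source's activation.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Plain NMF with Euclidean cost and multiplicative updates: V ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral templates
    H = rng.random((rank, n_frames)) + eps  # time activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic mixture of two "instruments" (assumption: toy data, not MAD-EEG).
rng = np.random.default_rng(1)
n_freq, n_frames = 8, 60
W_true = np.zeros((n_freq, 2))
W_true[:4, 0] = [1.0, 0.8, 0.5, 0.2]   # source 0 occupies the low bins
W_true[4:, 1] = [0.3, 0.6, 1.0, 0.7]   # source 1 occupies the high bins
H_true = np.zeros((2, n_frames))
H_true[0, :20] = 0.5 + rng.random(20)  # source 0 plays alone, then rests
H_true[0, 40:] = 0.5 + rng.random(20)
H_true[1, 20:] = 0.5 + rng.random(40)  # source 1 enters at frame 20
S_true = [np.outer(W_true[:, j], H_true[j]) for j in range(2)]
V = S_true[0] + S_true[1]

# Hypothetical EEG-derived feature: a noisy copy of the attended activation.
neural_envelope = H_true[0] + 0.1 * rng.standard_normal(n_frames)

W, H = nmf(V, rank=2)

# "Contrast": the attended component should correlate strongly with the
# neural feature, the unattended one only weakly.
corr = np.nan_to_num([np.corrcoef(h, neural_envelope)[0, 1] for h in H], nan=-1.0)
k = int(np.argmax(corr))

# Soft (Wiener-like) mask to extract the selected component from the mixture.
mask = np.outer(W[:, k], H[k]) / (W @ H + 1e-9)
est = mask * V
print("selected component:", k, "correlations:", np.round(corr, 2))
```

In this sketch the neural information is used only after factorisation, to choose a component. As the abstract describes it, C-NMF instead uses the contrast to inform the separation model itself, which this toy example does not attempt to reproduce.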

https://tel.archives-ouvertes.fr/tel-03511225
Contributor : ABES STAR
Submitted on : Tuesday, January 4, 2022 - 6:55:09 PM
Last modification on : Wednesday, January 5, 2022 - 3:06:14 AM

File

104623_CANTISANI_2021_archivag...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03511225, version 1

Citation

Giorgia Cantisani. Neuro-steered music source separation. Signal and Image Processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT038⟩. ⟨tel-03511225⟩
