# Sparse high dimensional regression in the presence of colored heteroscedastic noise: application to M/EEG source imaging

1 PARIETAL - Modelling brain structure, function and variability based on high-field MRI data
Inria Saclay - Ile de France, NEUROSPIN - Service NEUROSPIN
Abstract : Understanding the functioning of the brain under normal and pathological conditions is one of the challenges of the 21st century. In the last decades, neuroimaging has radically affected clinical and cognitive neurosciences. Amongst neuroimaging techniques, magneto- and electroencephalography (M/EEG) stand out for two reasons: their non-invasiveness, and their excellent time resolution. Reconstructing the neural activity from the recordings of magnetic fields and electric potentials is the so-called bio-magnetic inverse problem. Because of the limited number of sensors, this inverse problem is severely ill-posed, and additional constraints must be imposed in order to solve it. A popular approach, considered in this manuscript, is to assume spatial sparsity of the solution: only a few brain regions are involved in a short and specific cognitive task. Solutions exhibiting such a neurophysiologically plausible sparsity pattern can be obtained through L21-penalized regression approaches. However, this regularization requires solving time-consuming, high-dimensional and non-smooth optimization problems with iterative (block) proximal gradient solvers. Additionally, M/EEG recordings are usually corrupted by strong non-white noise, which breaks the classical statistical assumptions of inverse problems. To circumvent this, it is customary to whiten the data as a preprocessing step, and to average multiple repetitions of the same experiment to increase the signal-to-noise ratio. Averaging measurements has the drawback of removing brain responses which are not phase-locked, i.e. which do not happen at a fixed latency after the stimulus presentation onset. In this work, we first propose speed improvements of iterative solvers used for the L21-regularized bio-magnetic inverse problem. Typical improvements, screening and working sets, exploit the sparsity of the solution: by identifying inactive brain sources, they reduce the dimensionality of the optimization problem.
We introduce a new working set policy, derived from the state-of-the-art Gap safe screening rules. In this framework, we also propose duality improvements, yielding a tighter control of optimality and improving feature identification techniques. This dual construction extrapolates on an asymptotic Vector AutoRegressive regularity of the dual iterates, which we connect to manifold identification of proximal algorithms. Beyond the L21-regularized bio-magnetic inverse problem, the proposed methods apply to the whole class of sparse Generalized Linear Models. Second, we introduce new concomitant estimators for multitask regression. Along with the neural sources, concomitant estimators jointly estimate the noise covariance matrix. We design them to handle non-white Gaussian noise, and to exploit the multiple-repetition nature of M/EEG experiments. Instead of averaging the observations, our proposed method, CLaR, uses them all for a better estimation of the noise. The underlying optimization problem is jointly convex in the regression coefficients and the noise variable, with a "smooth + proximable" composite structure. It is therefore solvable via standard alternating minimization, to which we apply the improvements detailed in the first part. We provide a theoretical analysis of our objective function, linking it to the smoothing of Schatten norms. We demonstrate the benefits of the proposed approach for source localization on real M/EEG datasets. Our improved solvers and refined modeling of the noise pave the way for a faster and more statistically efficient processing of M/EEG recordings, allowing for interactive data analysis and scaling approaches to larger and larger M/EEG datasets.
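As an illustration of the kind of solver the abstract refers to, the following is a minimal sketch of a plain block proximal gradient method for L21-penalized multitask regression, together with a duality gap computed from a rescaled residual. It is not the thesis's implementation: it assumes the objective 0.5 * ||Y - X B||_F^2 + lam * sum_j ||B_j||_2, and the function names (`block_soft_threshold`, `l21_proximal_gradient`, `duality_gap`) are hypothetical. It includes no screening, working sets, or dual extrapolation.

```python
import numpy as np


def block_soft_threshold(B, tau):
    """Row-wise proximal operator of tau * sum_j ||B_j||_2."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Rows with norm <= tau are set to zero, the others are shrunk.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * B


def primal(X, Y, B, lam):
    """Assumed objective: 0.5 * ||Y - X B||_F^2 + lam * ||B||_{2,1}."""
    residual = Y - X @ B
    return 0.5 * np.sum(residual ** 2) + lam * np.linalg.norm(B, axis=1).sum()


def duality_gap(X, Y, B, lam):
    """Primal value minus dual value at a feasible rescaled residual."""
    residual = Y - X @ B
    # Rescale so that every row of X.T @ theta has L2 norm <= 1.
    dual_norm = np.linalg.norm(X.T @ residual, axis=1).max()
    theta = residual / max(lam, dual_norm)
    dual = 0.5 * np.sum(Y ** 2) - 0.5 * lam ** 2 * np.sum((theta - Y / lam) ** 2)
    return primal(X, Y, B, lam) - dual


def l21_proximal_gradient(X, Y, lam, n_iter=500):
    """Plain proximal gradient (ISTA) with constant step size 1 / L."""
    B = np.zeros((X.shape[1], Y.shape[1]))
    lipschitz = np.linalg.norm(X, ord=2) ** 2  # squared spectral norm of X
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y)
        B = block_soft_threshold(B - grad / lipschitz, lam / lipschitz)
    return B
```

In an M/EEG setting, `X` would play the role of the gain (forward) matrix, each row of `B` a candidate source, and each column a time instant, so a zero row corresponds to an inactive source. The duality gap is non-negative and can serve as a stopping criterion; it is also the quantity on which gap-based screening rules rely.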
Document type :
Theses

Cited literature [200 references]

https://tel.archives-ouvertes.fr/tel-02401628
Contributor : Mathurin Massias
Submitted on : Monday, January 20, 2020 - 11:25:17 AM
Last modification on : Wednesday, January 22, 2020 - 1:27:06 AM

### File

thesis_tel.pdf
Files produced by the author(s)

### Identifiers

• HAL Id : tel-02401628, version 2

### Citation

Mathurin Massias. Sparse high dimensional regression in the presence of colored heteroscedastic noise: application to M/EEG source imaging. Machine Learning [stat.ML]. Telecom Paristech, 2019. English. ⟨tel-02401628v2⟩
