
Unsupervised component analysis for neuroimaging data

Abstract : This thesis in computer science and mathematics is applied to the field of neuroscience, and more particularly to the mapping of brain activity based on imaging and electrophysiology. In this field, a rising trend is to experiment with naturalistic stimuli such as movie watching or audio track listening, rather than tightly controlled but outrageously simple stimuli. However, the analysis of these "naturalistic" stimuli and their effects requires a huge number of images that remain hard and costly to acquire. Without mathematical modeling, the identification of neural signal from the measurements is very hard if not impossible. However, the stimulations that elicit neural activity are challenging to model in this context, and therefore, the statistical analysis of the data using regression-based approaches is difficult. This has motivated the use of unsupervised learning methods that do not make assumptions about what triggers brain activations in the presented stimuli. In this thesis, we first consider the case of the shared response model (SRM), where subjects are assumed to share a common response. While this algorithm is useful to perform dimension reduction, it is particularly costly on functional magnetic resonance imaging (fMRI) data, where the dimension can be very large. We considerably speed up the algorithm and reduce its memory usage. However, SRM relies on assumptions that are not biologically plausible. In contrast, independent component analysis (ICA) is more realistic but not suited to multi-subject datasets. In this thesis, we present a well-principled method called MultiViewICA that extends ICA to datasets containing multiple subjects. MultiViewICA is a maximum likelihood estimator. It comes with a closed-form likelihood that can be efficiently optimized. However, it assumes the same amount of noise for all subjects. We therefore introduce ShICA, a generalization of MultiViewICA that comes with a more general noise model.
In contrast to almost all ICA-based models, ShICA can separate Gaussian and non-Gaussian sources and comes with a minimum mean square error estimate of the common sources that weights each subject according to its estimated noise level. In practice, MultiViewICA and ShICA yield a more reliable estimate of the shared response than competitors on magnetoencephalography and functional magnetic resonance imaging data. Lastly, we use independent component analysis as a basis to perform data augmentation. More precisely, we introduce CondICA, a data augmentation method that leverages a large amount of unlabeled fMRI data to build a generative model for labeled data using only a few labeled samples. CondICA yields an increase in decoding accuracy on eight large fMRI datasets. Our main contributions consist in the reduction of SRM's training time as well as in the introduction of two more realistic models for the analysis of brain activity of subjects exposed to naturalistic stimuli: MultiViewICA and ShICA. Lastly, our results showing that ICA can be used for data augmentation are promising. In conclusion, we present some directions that could guide future work. From a practical point of view, minor modifications of our methods could allow the analysis of resting state data assuming a shared spatial organization instead of a shared response. From a theoretical perspective, future work could focus on understanding how dimension reduction and shared response identification can be achieved jointly.
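The idea of a minimum mean square error estimate that weights each subject by its noise level can be illustrated with a toy scalar case. The sketch below is not the ShICA implementation. It assumes a simplified model in which each subject observes x_i = s + n_i, with a Gaussian shared source s of known prior variance and independent Gaussian noise n_i of known variance. The function name and parameters are hypothetical.

```python
def mmse_shared_source(observations, noise_variances, prior_variance=1.0):
    """Posterior mean of a shared scalar source s.

    Assumed toy model (not taken from the thesis):
        x_i = s + n_i,  s ~ N(0, prior_variance),  n_i ~ N(0, noise_variances[i])
    Under these Gaussian assumptions the MMSE estimate is a
    precision-weighted sum of the observations, shrunk by the prior.
    """
    if len(observations) != len(noise_variances):
        raise ValueError("one noise variance is required per observation")
    if prior_variance <= 0 or any(v <= 0 for v in noise_variances):
        raise ValueError("variances must be strictly positive")

    # Precision (inverse variance) of each subject: noisier subjects weigh less.
    precisions = [1.0 / v for v in noise_variances]
    weighted_sum = sum(p * x for p, x in zip(precisions, observations))

    # The prior contributes its own precision to the normalization.
    return weighted_sum / (1.0 / prior_variance + sum(precisions))


# Subject 0 is less noisy than subject 1, so its observation dominates.
estimate = mmse_shared_source([1.0, 3.0], [0.5, 2.0])
```

With equal noise variances the estimate reduces to a shrunk average of the observations. With unequal variances, as in the usage line, the less noisy subject receives the larger weight.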
Contributor : ABES STAR
Submitted on : Tuesday, January 18, 2022 - 8:59:54 AM
Last modification on : Friday, February 4, 2022 - 3:12:48 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03531027, version 1


Hugo Richard. Unsupervised component analysis for neuroimaging data. Artificial Intelligence [cs.AI]. Université Paris-Saclay, 2021. English. ⟨NNT : 2021UPASG115⟩. ⟨tel-03531027⟩


