
Independent component analysis by wavelets

Abstract : Independent component analysis (ICA) is a form of multivariate analysis that emerged as a concept in the 1980s and 1990s. It is an inverse problem in which one observes a variable X whose components are linear mixtures of the components of an unobservable variable S, and the components of S are mutually independent. The relation between the two variables is X = AS, where A is an unknown mixing matrix.
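
To make the model X = AS concrete, here is a minimal simulation sketch. The source distributions, the matrix A, the sample size and the function name are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

def simulate_mixture(n, seed=0):
    """Draw n observations of X = A S with two independent sources."""
    rng = np.random.default_rng(seed)
    # Hypothetical independent sources: one uniform, one Laplace.
    S = np.vstack([rng.uniform(-1.0, 1.0, n), rng.laplace(0.0, 1.0, n)])
    # Hypothetical mixing matrix; in practice A is unknown.
    A = np.array([[1.0, 0.5],
                  [0.3, 1.0]])
    X = A @ S  # each observed component is a linear mixture of the sources
    return A, S, X

A, S, X = simulate_mixture(5000)
source_corr = np.corrcoef(S)[0, 1]    # close to 0: sources are independent
observed_corr = np.corrcoef(X)[0, 1]  # clearly non-zero: mixtures are dependent
```

Only X would be available in a real problem; A and S are returned here so that an estimate can be checked against the truth.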

The main problem in ICA is to estimate the matrix A from an i.i.d. sample of X, in order to recover S, which provides a better explanatory system than X for the phenomenon under study. The problem is generally solved by minimizing a criterion derived from some dependence measure.

ICA resembles principal component analysis (PCA) in its formulation. PCA seeks uncorrelated components, that is, components that are pairwise independent at order 2; ICA seeks mutually independent components, which is a much stronger constraint, and in the general case there is no longer a simple algebraic solution. The main identifiability problems for A are removed by the restrictions imposed in the classical ICA model.

The approach proposed in this thesis adopts a nonparametric point of view. Under Besov assumptions, we study several estimators of an exact dependence criterion given by the L2 norm of the difference between a density and the product of its marginals. This criterion is an alternative to mutual information, which has so far been the exact reference criterion for the majority of ICA methods.
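
For reference, the criterion described above can be written as follows. The notation is an assumption made here for illustration (f is the joint density on R^d, f^{\star\ell} its \ell-th marginal), and the squared form is used for convenience.

```latex
C(f) \;=\; \int_{\mathbb{R}^d} \Bigl( f(x_1,\dots,x_d) \;-\; \prod_{\ell=1}^{d} f^{\star\ell}(x_\ell) \Bigr)^{2} \, dx ,
\qquad
C(f) = 0 \iff f = \prod_{\ell=1}^{d} f^{\star\ell} \ \text{almost everywhere.}
```

The criterion is exact in the sense that it vanishes if and only if the components are mutually independent.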

We give an upper bound on the mean squared error of several estimators of the L2 contrast. This bound takes into account the approximation bias between the Besov space and the projection space, which here stems from a multiresolution analysis (MRA) generated by the tensor product of Daubechies wavelets. Bounds of this type, which account for the approximation bias, are generally absent from recent nonparametric methods in ICA (kernel methods, mutual information).
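
The role of the approximation bias can be sketched as follows. The notation is assumed here and not taken from the thesis: \varphi_{jk} is the scaling function at resolution j, \Phi_{jk} its tensor product, P_j the orthogonal projection onto the space V_j^d spanned by the \Phi_{jk}, f the joint density and f^{\star} the product of its marginals f^{\star\ell}, all assumed square integrable.

```latex
\Phi_{jk}(x) = \prod_{\ell=1}^{d} \varphi_{jk_\ell}(x_\ell), \qquad
\alpha_{jk} = \int f \,\Phi_{jk}, \qquad
\alpha_{jk_\ell} = \int f^{\star\ell} \,\varphi_{jk_\ell},

C_j(f) \;=\; \bigl\| P_j (f - f^{\star}) \bigr\|_2^2
       \;=\; \sum_{k \in \mathbb{Z}^d} \Bigl( \alpha_{jk} - \prod_{\ell=1}^{d} \alpha_{jk_\ell} \Bigr)^{2},

\| f - f^{\star} \|_2^2 \;=\; C_j(f) \;+\; \bigl\| (I - P_j)(f - f^{\star}) \bigr\|_2^2 .
```

The last identity is the Pythagorean theorem for the orthogonal projection P_j. Its second term is the approximation bias: an estimator built on the coefficients at resolution j targets C_j(f), not the full criterion.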

The L2 norm criterion connects ICA to well-known problems in the statistical literature: estimation of the integral of f squared, L2 norm homogeneity tests, and convergence rates for estimators using block thresholding.

We propose estimators of the L2 contrast that reach the optimal minimax rate of the problem of estimating the integral of f squared. These estimators, of U-statistic type, have a numerical complexity quadratic in n, which can be a problem for the subsequent contrast minimization needed to obtain a concrete estimate of the matrix A. However, these estimators also admit a block-thresholded version, for which knowledge of the regularity s of the underlying multivariate density is not needed to obtain an optimal rate.
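
The quadratic cost of a U-statistic estimator can be seen on the simpler problem of estimating the integral of f squared in dimension 1. The sketch below is not the thesis's estimator: it assumes data in [0, 1), replaces the Daubechies wavelets with the Haar scaling function to stay short, and uses hypothetical function names. The pairwise matrix is kept on purpose to show the order-n-squared structure.

```python
import numpy as np

def haar_projection_kernel(x, y, j):
    """Sum over k of phi_jk(x) * phi_jk(y) for the Haar scaling function.

    Two points contribute 2^j when they fall in the same dyadic cell
    of width 2^-j, and 0 otherwise.
    """
    same_cell = np.floor(2.0**j * x) == np.floor(2.0**j * y)
    return 2.0**j * same_cell

def ustat_integral_f2(sample, j):
    """U-statistic estimate of the integral of f^2 at resolution j."""
    n = sample.shape[0]
    # n x n matrix of kernel values: this is the quadratic cost in n.
    K = haar_projection_kernel(sample[:, None], sample[None, :], j)
    np.fill_diagonal(K, 0.0)  # a U-statistic excludes the pairs i == j
    return K.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
uniform_sample = rng.uniform(size=1500)      # true integral of f^2 is 1
beta_sample = rng.beta(2.0, 2.0, size=1500)  # true integral of f^2 is 6/5
est_uniform = ustat_integral_f2(uniform_sample, j=3)
est_beta = ustat_integral_f2(beta_sample, j=5)
```

Excluding the diagonal removes the bias that the squared terms would introduce, which is what separates a U-statistic from the corresponding V-statistic.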

We also propose a plug-in type estimator whose convergence rate is suboptimal but whose numerical complexity is linear in n. The plug-in estimator also admits a term-by-term thresholded version, which degrades the convergence rate but yields an adaptive criterion. In its linear version, the plug-in estimator already appears to be self-adaptive in practice: under the constraint 2^{jd} < n, where d is the dimension of the problem and n the number of observations, the majority of resolutions j make it possible to estimate A after minimization.
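
A plug-in estimator of this kind can be sketched as follows: estimate the joint and marginal scaling coefficients empirically, then sum the squared differences. This is an illustration under stated assumptions, not the thesis's implementation: data are assumed to lie in [0, 1]^d, the Haar scaling function stands in for Daubechies wavelets (so the coefficients reduce to histogram proportions), and the function name is hypothetical.

```python
from functools import reduce
import numpy as np

def plugin_l2_contrast(X, j):
    """Plug-in L2 contrast at resolution j for an (n, d) sample in [0, 1]^d."""
    n, d = X.shape
    m = 2**j
    if m**d >= n:
        raise ValueError("resolution too fine: need 2^(j d) < n")
    # One pass over the data: cell proportions of the joint distribution.
    counts, _ = np.histogramdd(X, bins=m, range=[(0.0, 1.0)] * d)
    joint = counts / n
    # Marginal proportions, obtained by summing out the other axes.
    marginals = [
        joint.sum(axis=tuple(a for a in range(d) if a != axis))
        for axis in range(d)
    ]
    product = reduce(np.multiply.outer, marginals)
    # With Haar functions, alpha_jk = 2^(j d / 2) * proportion of cell k,
    # hence the factor 2^(j d) in front of the sum of squares.
    return m**d * np.sum((joint - product) ** 2)

rng = np.random.default_rng(1)
U = rng.uniform(size=(20000, 2))
independent = U
# A 45-degree rotation of U, rescaled to stay inside [0, 1]^2.
dependent = np.column_stack([(U[:, 0] + U[:, 1]) / 2,
                             (U[:, 0] - U[:, 1] + 1) / 2])
c_ind = plugin_l2_contrast(independent, 3)
c_dep = plugin_l2_contrast(dependent, 3)
```

The contrast stays near zero for the independent sample and is markedly larger for the rotated one. The check on 2^{jd} < n mirrors the constraint stated above.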

To obtain these results, we had to develop specific combinatorial tools that make it possible to bound the rth moment of a U-statistic or a V-statistic. Standard results on U-statistics are not directly usable, nor easily adaptable, in the setting of the thesis. The tools developed can be used in other contexts.

The wavelet method follows the usual paradigm: estimation of an independence criterion, then minimization. We therefore study the elements needed for minimization. In particular, we give filter-aware formulations of the gradient and the Hessian of the contrast estimator, which can be computed with a complexity equivalent to that of the estimator itself.
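
The estimate-then-minimize paradigm can be illustrated in dimension 2, where the search reduces to a single rotation angle once the data are whitened. The sketch below is not the thesis's method: it uses a plain grid search in place of the gradient and Hessian formulations, a Haar histogram contrast in place of Daubechies wavelets, and hypothetical sources, mixing matrix and function names.

```python
import numpy as np

def haar_contrast_2d(Y, j):
    """Plug-in L2 contrast of an (n, 2) sample on a 2^j x 2^j Haar grid."""
    n, m = Y.shape[0], 2**j
    # Rescale each coordinate to [0, 1); componentwise maps keep independence.
    U = (Y - Y.min(axis=0)) / (Y.max(axis=0) - Y.min(axis=0) + 1e-12)
    idx = np.minimum((U * m).astype(int), m - 1)
    joint = np.zeros((m, m))
    np.add.at(joint, (idx[:, 0], idx[:, 1]), 1.0 / n)
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return m**2 * np.sum((joint - product) ** 2)

def whiten(X):
    """Center the (n, 2) sample and give it identity covariance (PCA step)."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvec / np.sqrt(eigval)

def rotate(Z, theta):
    """Apply the plane rotation of angle theta to each row of Z."""
    c, s = np.cos(theta), np.sin(theta)
    return Z @ np.array([[c, -s], [s, c]]).T

rng = np.random.default_rng(0)
n, j = 4000, 3                      # satisfies 2^(j d) = 64 < n
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, 2))  # hypothetical sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                 # hypothetical mixing
X = S @ A.T                         # observed mixtures, one row per observation

Z = whiten(X)
# The contrast ignores sign and order, so angles in [0, pi/2) are enough.
thetas = np.linspace(0.0, np.pi / 2, 90, endpoint=False)
values = np.array([haar_contrast_2d(rotate(Z, t), j) for t in thetas])
theta_hat = thetas[values.argmin()]
Y = rotate(Z, theta_hat)            # estimated sources, up to sign and order
```

Whitening handles the second-order part, as PCA would. The remaining rotation is the part that only an independence criterion can identify.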

Simulations proposed in the thesis confirm the applicability of the method and give excellent results. All the information necessary for implementing the method, together with the commented code of key parts of the program (notably the d-dimensional algorithms), also appears in the document.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00119428
Contributor : Pascal Barbedor
Submitted on : Saturday, December 9, 2006 - 4:08:27 PM
Last modification on : Friday, March 27, 2020 - 3:54:28 AM
Long-term archiving on : Tuesday, April 6, 2010 - 8:45:55 PM

Identifiers

  • HAL Id : tel-00119428, version 1

Citation

Pascal Barbedor. Independent component analysis by wavelets. Mathematics [math]. Université Paris-Diderot - Paris VII, 2006. English. ⟨tel-00119428⟩
