Mixture models for clustering and dimension reduction

Jakob Verbeek

Abstract

In Chapter 1 we give a general introduction and motivate the need for clustering and dimension reduction methods. We continue in Chapter 2 with a review of different types of existing clustering and dimension reduction methods.

In Chapter 3 we introduce mixture densities and the expectation-maximization (EM) algorithm to estimate their parameters. Although the EM algorithm has many attractive properties, it is not guaranteed to return optimal parameter estimates. We present greedy EM parameter estimation algorithms which start with a one-component mixture and then iteratively add a component to the mixture and re-estimate the parameters of the current mixture. Experimentally, we demonstrate that our algorithms avoid many of the sub-optimal estimates returned by the EM algorithm. Finally, we present an approach to accelerate the estimation of mixture densities from large numbers of data points. We apply this approach to both the standard EM algorithm and our greedy EM algorithm.
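The greedy scheme can be illustrated with a minimal sketch for a one-dimensional Gaussian mixture. This is not the algorithm of the thesis: the way a new component is proposed here (centred on a randomly chosen data point, with mixing weight 1/k and a shrunken variance) is an assumption made only for illustration, and the names `em`, `greedy_em`, `n_candidates` and `n_iter` are hypothetical.

```python
import numpy as np


def log_joint(x, w, mu, var):
    """Matrix of log(w_k * N(x_i | mu_k, var_k)), shape (n_points, n_components)."""
    diff = x[:, None] - mu[None, :]
    return np.log(w) - 0.5 * np.log(2.0 * np.pi * var) - 0.5 * diff**2 / var


def log_likelihood(x, w, mu, var):
    """Total data log-likelihood under the mixture."""
    return np.logaddexp.reduce(log_joint(x, w, mu, var), axis=1).sum()


def em(x, w, mu, var, n_iter=50):
    """Run a fixed number of standard EM iterations from the given starting point."""
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        lj = log_joint(x, w, mu, var)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1)[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var


def greedy_em(x, max_k, n_candidates=10, seed=0):
    """Grow a mixture one component at a time, re-running EM after each insertion."""
    rng = np.random.default_rng(seed)
    # A one-component mixture has a closed-form maximum-likelihood fit.
    w, mu, var = np.array([1.0]), np.array([x.mean()]), np.array([x.var()])
    for k in range(2, max_k + 1):
        best, best_ll = None, -np.inf
        # Assumed insertion heuristic: try several data points as the new mean
        # and keep the candidate with the highest log-likelihood after EM.
        for c in rng.choice(x, size=n_candidates, replace=False):
            cand = em(
                x,
                np.append(w * (1.0 - 1.0 / k), 1.0 / k),
                np.append(mu, c),
                np.append(var, x.var() / k**2),
            )
            ll = log_likelihood(x, *cand)
            if ll > best_ll:
                best, best_ll = cand, ll
        w, mu, var = best
    return w, mu, var


# Usage: two well-separated clusters, fitted with a two-component mixture.
data_rng = np.random.default_rng(1)
x = np.concatenate([data_rng.normal(-5.0, 1.0, 200), data_rng.normal(5.0, 1.0, 200)])
w, mu, var = greedy_em(x, max_k=2)
print("weights:", w, "means:", mu, "variances:", var)
```

Because each insertion starts from the already fitted smaller mixture, the search at every step is over the placement of a single new component rather than over a full random initialization of all components.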

In Chapter 4 we present a non-linear dimension reduction method that uses a constrained EM algorithm for parameter estimation. Our approach is similar to Kohonen's self-organizing map but, in contrast to the self-organizing map, our parameter estimation algorithm is guaranteed to converge and optimizes a well-defined objective function. In addition, our method allows data with missing values to be used for parameter estimation, and it is readily applied to data that are specified not by real numbers but, for example, by discrete variables. We present the results of several experiments to demonstrate our method and to compare it with Kohonen's self-organizing map.

In Chapter 5 we consider an approach for non-linear dimension reduction which is based on a combination of clustering and linear dimension reduction. This approach forms one global non-linear low-dimensional data representation by combining multiple, locally valid, linear low-dimensional representations. We derive an improvement of the original parameter estimation algorithm, which requires less computation and leads to better parameter estimates. We experimentally compare this approach to several other dimension reduction methods. We also apply this approach to a setting where high-dimensional 'outputs' have to be predicted from high-dimensional 'inputs'. Experimentally, we show that the considered non-linear approach leads to better predictions than a similar approach which also combines several local linear representations, but does not combine them into one global non-linear representation.

In Chapter 6 we summarize our conclusions and discuss directions for further research.
