A closed patterns-based approach to the consensus clustering problem

Atheer Al-Najdi 1
1 Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Projet MinD
Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract : Clustering is the process of partitioning a dataset into groups so that the instances in the same group are more similar to each other than to instances in any other group. Many clustering algorithms have been proposed, but none has been shown to produce good-quality partitions in all situations. Consensus clustering aims to improve the clustering process by combining different partitions obtained from different algorithms into a better-quality consensus solution. In this work, a new consensus clustering method, called MultiCons, is proposed. It uses frequent closed itemset mining to discover the similarities between the different base clustering solutions. The identified similarities are presented in the form of clustering patterns, each of which defines the agreement between a set of base clusters in grouping a set of instances. By dividing these patterns into groups based on the number of base clusters that define each pattern, MultiCons generates a consensus solution from each group, which yields multiple consensus candidates. These solutions are presented in a tree-like structure, called ConsTree, that helps in understanding how the multiple consensuses are built, as well as the relationships between the data instances and their structure in the data space. Five consensus functions are proposed in this work to build a consensus solution from the clustering patterns. Approach 1 simply merges any intersecting clustering patterns. Approach 2 can either merge or split intersecting patterns based on a proposed measure, called intersection ratio
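The pattern-discovery step and the first consensus function described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `closed_patterns` and `merge_intersecting`, the toy label vectors, and the brute-force closure (which only suits tiny inputs) are all assumptions made for illustration. Each instance is treated as a transaction whose items are the base clusters it belongs to, so every closed itemset pairs a set of base clusters with the instances they all agree on.

```python
from itertools import combinations


def closed_patterns(partitions):
    """Map each closed set of base clusters to the instances they share.

    `partitions` is a list of label vectors, one per base clustering
    (hypothetical input format). A base cluster is written as the pair
    (partition index, label).
    """
    n = len(partitions[0])
    # One transaction per instance: the base clusters that contain it.
    transactions = [
        frozenset((p, labels[i]) for p, labels in enumerate(partitions))
        for i in range(n)
    ]
    # Closed itemsets are the non-empty intersections of transactions;
    # intersect pairwise until no new itemset appears (brute force).
    closed = set(transactions)
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            common = a & b
            if common and common not in closed:
                closed.add(common)
                changed = True
    return {
        items: frozenset(i for i, t in enumerate(transactions) if items <= t)
        for items in closed
    }


def merge_intersecting(instance_sets):
    """Merge instance sets that overlap, in the spirit of Approach 1."""
    merged = []
    for current in instance_sets:
        current = set(current)
        kept = []
        for group in merged:
            if group & current:
                current |= group  # absorb every overlapping group
            else:
                kept.append(group)
        kept.append(current)
        merged = kept
    return merged


# Three toy base clusterings of six instances (made-up data).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
patterns = closed_patterns(partitions)

# Patterns defined by exactly two base clusters, merged where they overlap.
level_two = [inst for items, inst in patterns.items() if len(items) == 2]
consensus_candidate = merge_intersecting(level_two)
```

Grouping the patterns by `len(items)` mirrors the abstract's idea of dividing patterns by the number of base clusters that define them. How MultiCons actually processes these groups and builds the ConsTree is not specified in this record, so that part is left out.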
Document type :
Theses

Cited literature : 62 references

https://tel.archives-ouvertes.fr/tel-01478626
Contributor : Abes Star
Submitted on : Tuesday, February 28, 2017 - 11:37:22 AM
Last modification on : Monday, November 5, 2018 - 3:52:10 PM
Long-term archiving on : Monday, May 29, 2017 - 1:30:24 PM

File

2016AZUR4111.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01478626, version 1

Citation

Atheer Al-Najdi. A closed patterns-based approach to the consensus clustering problem. Other [cs.OH]. Université Côte d'Azur, 2016. English. ⟨NNT : 2016AZUR4111⟩. ⟨tel-01478626⟩
