Ensemble multi-label learning in supervised and semi-supervised settings

Ouadie Gharroudi
Abstract : Multi-label learning is a supervised learning problem in which each instance can be associated with multiple target labels simultaneously. It is ubiquitous in machine learning and arises naturally in many real-world applications such as document classification, automatic music tagging and image annotation. In this thesis, we formulate multi-label learning as an ensemble learning problem in order to provide satisfactory solutions for both the multi-label classification and the feature selection tasks, while being consistent with respect to any type of objective loss function. We first discuss why the state-of-the-art single multi-label algorithms using an effective committee of multi-label models suffer from certain practical drawbacks. We then propose a novel strategy to build and aggregate a k-labelsets-based committee in the context of ensemble multi-label classification. Next, we analyze in depth the effect of the aggregation step within ensemble multi-label approaches and investigate how this aggregation impacts prediction performance with respect to the objective multi-label loss metric. We then address the specific problem of identifying relevant subsets of features, among potentially irrelevant and redundant features, in the multi-label context based on the ensemble paradigm. Three wrapper multi-label feature selection methods based on the Random Forest paradigm are proposed. These methods differ in the way they consider label dependence within the feature selection process. Finally, we extend the multi-label classification and feature selection problems to the semi-supervised setting and consider the situation where only a few labelled instances are available. We propose a new semi-supervised multi-label feature selection approach based on the ensemble paradigm. The proposed model combines ideas from co-training and multi-label k-labelsets committee construction in tandem with an inner out-of-bag label feature importance evaluation. Satisfactorily tested on several benchmark datasets, the approaches developed in this thesis show promise for a variety of applications in supervised and semi-supervised multi-label learning.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-01736344
Contributor : Abes Star
Submitted on : Friday, March 16, 2018 - 5:59:07 PM
Last modification on : Thursday, November 21, 2019 - 2:26:47 AM
Long-term archiving on : Tuesday, September 11, 2018 - 7:20:28 AM

File

TH2017GHARROUDIOUADIE.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01736344, version 1

Citation

Ouadie Gharroudi. Ensemble multi-label learning in supervised and semi-supervised settings. Artificial Intelligence [cs.AI]. Université de Lyon, 2017. English. ⟨NNT : 2017LYSE1333⟩. ⟨tel-01736344⟩
