Mesures de discrimination et leurs applications en apprentissage inductif [Discrimination measures and their applications in inductive learning]

Thanh Ha Dang 1
1 MALIRE - Machine Learning and Information Retrieval
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : Available data are becoming ever more voluminous and diverse in nature: vague data, missing data, and numerical or symbolic data can all be encountered. Users, however, are more interested in the knowledge that can be extracted from the data than in the data themselves. Given the large quantity of available data, processing it effectively is cumbersome. In this thesis we adopt an approach to knowledge extraction from data based on inductive learning, more precisely on decision trees. In general, the purpose of a system built by inductive learning is to discriminate between individuals belonging to different classes. Its quality depends on its discrimination power, which is acquired from the data during the learning phase. In particular, a decision tree construction algorithm works by successively evaluating the discrimination power of the attributes. We investigate measures of discrimination, both classical and fuzzy, and their applications in inductive learning. On the one hand, we consider discrimination measures for the construction of decision trees. We begin by studying these measures following an axiomatic approach and develop a new model that makes it possible to characterize fuzzy measures of discrimination. We then propose using these measures during the various stages of fuzzy decision tree construction. On the other hand, we study the use of these measures during other steps of the learning process. First, we examine the classifier evaluation process and propose an evaluation criterion based on the concept of discrimination power. Next, we consider the missing data problem and propose a new imputation technique that restores the discrimination power of attributes. This work is validated on conventional data and applied to real problems such as email classification and the classification of human-computer interaction traces.
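
As a rough illustration of how a decision tree algorithm can evaluate the discrimination power of attributes, the sketch below scores each attribute by information gain, i.e. the reduction in Shannon entropy of the class labels. This is a classical measure chosen for illustration only; the abstract does not specify the thesis's own measures (in particular the fuzzy ones), and the function names and the toy email data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in class entropy obtained by splitting on one attribute."""
    total = len(labels)
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the classes after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest discrimination power (here: gain)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: four emails described by two symbolic attributes.
rows = [
    {"sender_known": "yes", "has_link": "no"},
    {"sender_known": "yes", "has_link": "yes"},
    {"sender_known": "no", "has_link": "yes"},
    {"sender_known": "no", "has_link": "no"},
]
labels = ["ham", "ham", "spam", "spam"]

chosen = best_attribute(rows, labels, ["sender_known", "has_link"])
```

In this toy set, `sender_known` separates the two classes perfectly while `has_link` carries no information about the class, so a tree builder using this measure would split on `sender_known` first and then repeat the evaluation on each resulting subset.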
Document type :
Theses

Cited literature : 166 references

https://tel.archives-ouvertes.fr/tel-00184691
Contributor : Thanh Ha Dang
Submitted on : Saturday, February 23, 2008 - 5:52:01 PM
Last modification on : Friday, January 8, 2021 - 5:34:11 PM
Long-term archiving on : Tuesday, September 21, 2010 - 3:50:42 PM

Identifiers

  • HAL Id : tel-00184691, version 2

Citation

Thanh Ha Dang. Mesures de discrimination et leurs applications en apprentissage inductif. Interface homme-machine [cs.HC]. Université Pierre et Marie Curie - Paris VI, 2007. Français. ⟨tel-00184691v2⟩

Metrics

  • Record views : 821
  • File downloads : 413