Possibilistic Classifiers for Certain/Uncertain Numerical Data

Abstract : This thesis falls within the framework of machine learning and studies a variety of classification methods for numerical data. The first issue addressed is rule-based classification. Rule induction algorithms suffer from two main drawbacks when classifying test examples: i) the multiple classification problem, which arises when several rules cover an example and are associated with different classes, and ii) the choice of a default class, which concerns the non-covering case. Our first contribution is a family of Possibilistic Rule-based Classifiers (PRCs) that deals with these problems; the PRCs extend and modify the PART algorithm. They keep the same rule learning step as PART but differ in other respects. In particular, the PRCs learn fuzzy rules instead of crisp rules, consider weighted rules at deduction time in an unordered manner instead of rule lists, and reduce the number of examples not covered by any rule by using a fuzzy rule set with large supports. The reported experiments show that the PRCs improve on the accuracy of the classical PART algorithm. On the other hand, Naive Bayesian Classifiers (NBCs), which rely on independence hypotheses together with a normality assumption to estimate densities for numerical data, are known for their simplicity and effectiveness. However, estimating densities, even under the normality assumption, may be problematic when the data are poor. In such a situation, possibility distributions may provide a more faithful representation of the data. The second contribution of this thesis focuses on the estimation of possibility distributions for continuous data. For this purpose, we investigate two families of possibilistic classifiers.
The first family is derived from classical or flexible Bayesian classifiers by applying a probability-possibility transformation to Gaussian distributions, which introduces some further tolerance in the description of classes and gives rise to the Naive Possibilistic Classifier (NPC) and the Flexible Naive Possibilistic Classifier (FNPC). In the same context, we also use a probability-possibility transformation method that enables us to derive a possibility distribution representing a family of Gaussian distributions. On this basis we propose two further possibilistic classifiers, NPC-2 and FNPC-2, which take into account the confidence intervals of the Gaussian distributions. The second family of possibilistic classifiers abandons the normality assumption and relies on a direct representation of the data. In this context we propose two classifiers, the Fuzzy Histogram Classifier (FuHC) and the Nearest Neighbor-based Possibilistic Classifier (NNPC). Both exploit an idea of proximity between attribute values in order to estimate possibility distributions. The last issue addressed in this thesis is the classification of data with continuous input variables in the presence of uncertainty. We extend the possibilistic classifiers previously proposed for numerical data in order to cope with uncertainty in data representation. We consider two types of uncertainty: i) the uncertainty associated with the class in the training set, which is modelled by a possibility distribution over class labels, and ii) the imprecision pervading attribute values in the testing set, represented in the form of intervals for continuous data. We first adapt the possibilistic classification model, previously proposed for the certain case, to accommodate the uncertainty about class labels. Then we propose an extension principle-based algorithm to deal with imprecise attribute values.
Possibilistic classifiers are compared to classical or flexible Bayesian classifiers on a collection of benchmark databases. The reported experiments show the efficiency of possibilistic classifiers in dealing with certain or uncertain data. In particular, the classifiers based on the probability-to-possibility transformation show robust behaviour when dealing with imperfect data.
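
Illustrative sketch (not taken from the thesis): the abstract does not give formulas, so the code below only illustrates the general idea of a Gaussian-based naive possibilistic classifier under stated assumptions. It assumes the standard probability-possibility transformation for a symmetric unimodal density, pi(x) = P(|X - mu| >= |x - mu|), combines per-attribute possibility degrees with min (product is another common choice), and handles an interval-valued attribute by taking the supremum of pi over the interval, in the spirit of the extension principle. All function names are hypothetical, and the NPC-2, FNPC, FuHC and NNPC variants are not covered.

```python
import math
from statistics import mean, stdev


def gaussian_possibility(x, mu, sigma):
    """Possibility degree of x under N(mu, sigma^2).

    Assumed transformation: pi(x) = P(|X - mu| >= |x - mu|),
    which equals erfc(|x - mu| / (sigma * sqrt(2))). Requires sigma > 0.
    """
    return math.erfc(abs(x - mu) / (sigma * math.sqrt(2.0)))


def interval_possibility(lo, hi, mu, sigma):
    """Supremum of pi over [lo, hi] (extension-principle style).

    pi is unimodal with its peak at mu, so the supremum is 1 when mu lies
    in the interval and otherwise is reached at the endpoint nearest to mu.
    """
    if lo <= mu <= hi:
        return 1.0
    nearest = lo if mu < lo else hi
    return gaussian_possibility(nearest, mu, sigma)


def fit(X, y):
    """Estimate (mean, stdev) per class and per attribute.

    Each class needs at least two examples and non-constant attributes.
    """
    params = {}
    for label in set(y):
        rows = [x for x, c in zip(X, y) if c == label]
        params[label] = [(mean(col), stdev(col)) for col in zip(*rows)]
    return params


def classify(params, x, combine=min):
    """Return the class with the highest combined possibility degree.

    Each attribute value is either a number or a (lo, hi) interval.
    """
    scores = {}
    for label, attrs in params.items():
        degrees = []
        for value, (mu, sigma) in zip(x, attrs):
            if isinstance(value, tuple):
                degrees.append(interval_possibility(value[0], value[1], mu, sigma))
            else:
                degrees.append(gaussian_possibility(value, mu, sigma))
        scores[label] = combine(degrees)
    return max(scores, key=scores.get)


# Toy data with one attribute and two classes (made up for illustration).
X = [[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]]
y = ["a", "a", "a", "b", "b", "b"]
params = fit(X, y)
print(classify(params, [1.1]))         # precise attribute value
print(classify(params, [(2.7, 2.9)]))  # imprecise (interval) attribute value
```

Passing combine=math.prod instead of min gives a product-based combination. Which combination, and which exact transformation, the thesis uses should be checked against the full text.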
Document type :
Theses
Cited literature [178 references]

https://tel.archives-ouvertes.fr/tel-02073768
Contributor : Myriam Bounhas
Submitted on : Wednesday, March 20, 2019 - 11:10:27 AM
Last modification on : Friday, March 22, 2019 - 1:56:35 PM
Long-term archiving on : Friday, June 21, 2019 - 1:28:35 PM

File

These.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02073768, version 1

Citation

Myriam Bounhas. Possibilistic Classifiers for Certain/Uncertain Numerical Data. Machine Learning [cs.LG]. Université de Tunis, 2013. English. ⟨tel-02073768⟩

Metrics

Record views

68

Files downloads

214