
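Outil d'aide au diagnostic du cancer à partir d'extraction d'informations issues de bases de données et d'analyses par biopuces [Cancer diagnosis support tool based on information extracted from databases and microarray analyses]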

Lyamine Hedjazi 1
1 LAAS-DISCO - Équipe DIagnostic, Supervision et COnduite
LAAS - Laboratoire d'analyse et d'architecture des systèmes
Abstract : Cancer is one of the most common causes of death in the world, and breast cancer is currently the most frequent cancer in women. Despite the significant improvements made in cancer management over recent decades, more accurate management tools are still needed to help physicians make the necessary treatment decisions, thereby reducing the related adverse effects and the high medical costs. This work addresses the use of machine learning techniques to develop such tools for breast cancer management. Clinical factors, such as patient age and histopathological variables, are still the basis of day-to-day decisions in cancer management. However, with the emergence of high-throughput technology, gene expression profiling is gaining increasing attention as a way to build more accurate predictive tools for breast cancer. Nevertheless, several challenges must be faced in developing such tools, mainly: (1) the high dimensionality of data produced by microarray technology; (2) the low signal-to-noise ratio of microarray measurements; (3) the membership uncertainty of patients with respect to cancer groups; and (4) the heterogeneous (or mixed-type) data usually present in clinical datasets. In this work we propose several approaches to deal appropriately with these challenges. A first approach addresses the problem of high data dimensionality by making use of ℓ1 learning capabilities to design an embedded feature selection algorithm for SVM (ℓ1 SVM) based on a gradient descent technique. The main idea is to transform the initial constrained convex optimization problem into an unconstrained one through the use of an approximated loss function. A second approach handles all of these challenges simultaneously and therefore allows the integration of several data sources (clinical, microarray, ...) to build more accurate predictive tools. To this end, a unified principle for dealing with the data heterogeneity problem is proposed.
This principle is based on mapping the different types of data from their initially heterogeneous spaces into a common space through an adequacy measure. To take membership uncertainty into account and to increase model interpretability, this principle is formulated within a fuzzy logic framework. In addition, to alleviate the problem of high noise levels, a symbolic approach is proposed that uses an interval representation to model the noisy measurements. Since all data are mapped into a common space, they can be processed in a unified way, whatever their initial type, for different data analysis purposes. Based on this principle, we designed in particular a supervised fuzzy feature weighting approach. The weighting process relies mainly on the definition of a membership margin for each sample. It then optimizes a membership-margin-based objective function using a classical optimization approach, which avoids a combinatorial search. This approach is extended to the unsupervised case to develop a weighted fuzzy rule-based clustering algorithm. The effectiveness of all approaches has been assessed through extensive experimental studies and compared with well-known state-of-the-art methods. Finally, several breast cancer applications have been carried out using the proposed approaches. In particular, predictive and prognostic models were derived from microarray and/or clinical data and compared with genetic and clinical approaches.
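As an illustration of the first idea in the abstract (embedded feature selection through an ℓ1 penalty, with the hinge loss replaced by a smooth approximation so that gradient steps apply), here is a minimal sketch. It is not the algorithm of the thesis: the softplus smoothing, the soft-thresholding (proximal) step, the function name `fit_l1_linear`, the toy data, and all parameter values are assumptions chosen only to keep the example short and self-contained.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero, exactly zero if |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_linear(X, y, lam=0.1, step=0.1, n_iter=500):
    """Minimize mean(softplus(1 - y * (w.x + b))) + lam * ||w||_1.

    softplus(1 - m) = log(1 + exp(1 - m)) is a smooth stand-in for the hinge
    loss max(0, 1 - m); this choice is an assumption of the sketch.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Derivative of softplus(1 - margin) with respect to the score.
            g = -yi * sigmoid(1.0 - margin) / n
            for j in range(d):
                grad_w[j] += g * xi[j]
            grad_b += g
        # Gradient step on the smooth loss, then shrinkage for the l1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical toy data: feature 0 separates the classes, feature 1 is
# uncorrelated with the label, feature 2 is only weakly correlated with it.
X = [
    [2.0, 1.0, 0.5], [2.0, -1.0, 0.5], [2.0, 1.0, 0.5], [2.0, -1.0, -0.5],
    [-2.0, 1.0, 0.5], [-2.0, -1.0, -0.5], [-2.0, 1.0, 0.5], [-2.0, -1.0, -0.5],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = fit_l1_linear(X, y, lam=0.1)           # with the l1 penalty
w_dense, _ = fit_l1_linear(X, y, lam=0.0)     # same loss, no penalty
print("l1-penalized weights:", w)
print("unpenalized weights: ", w_dense)
```

On this toy data the penalized fit keeps a weight only on the informative feature, while the unpenalized fit also assigns a nonzero weight to the weakly correlated one. Features whose weight is driven to exactly zero are the ones a selection step of this kind would discard.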

Cited literature : 247 references
Contributor : Arlette Evrard
Submitted on : Monday, January 9, 2012 - 3:39:17 PM
Last modification on : Thursday, June 10, 2021 - 3:05:57 AM
Long-term archiving on : Tuesday, April 10, 2012 - 2:36:05 AM


  • HAL Id : tel-00657959, version 1


Lyamine Hedjazi. Outil d'aide au diagnostic du cancer à partir d'extraction d'informations issues de bases de données et d'analyses par biopuces. Automatic Control Engineering. Université Paul Sabatier - Toulouse III, 2011. English. ⟨tel-00657959⟩


