

Abstract : Regression models for censored data assume, like any regression model, that there are more observations than descriptors and that the descriptors are not too strongly correlated with one another. These assumptions often fail in practice, and the standard approaches then become unusable. This is the case, for example, in pharmacogenomics, when patients' survival probability has to be estimated from profiles, or transcriptomic signatures, based on the expression of thousands of genes. The objective of this thesis was to provide a solution to this problem within the framework of PLS regression. The proposed PLS-Cox model arises from a generalization of PLS regression to any linear regression model. It provides a regularized alternative to survival regression models for high-dimensional data (p >> n). Moreover, a “Kernel” re-parameterization of the PLS algorithms has made it possible to develop solutions that are very fast in the “large p, small n” setting and also useful for dealing with non-linearity. Another solution to this problem, fast and simple to implement, has been developed based on deviance residuals. To handle data sets with missing values, an alternative to PLS-NIPALS has been proposed by introducing the concept of multiple imputation into simple and generalized PLS regression. Finally, a “Thresholding PLS” approach has been developed in order to select more parsimonious models.
Contributor : Philippe Bastien
Submitted on : Monday, March 31, 2008 - 5:43:45 PM
Last modification on : Wednesday, March 20, 2019 - 4:52:06 PM
Long-term archiving on : Friday, November 25, 2016 - 9:05:36 PM


  • HAL Id : tel-00268344, version 1



Philippe Bastien. RÉGRESSION PLS ET DONNÉES CENSURÉES. Life Sciences [q-bio]. Conservatoire national des arts et métiers - CNAM, 2008. French. ⟨tel-00268344⟩


