
Multidimensional penalized splines for modelling the occurrence rate of an event: application to the excess mortality rate and to net survival in the epidemiology of chronic diseases

Abstract : Time-to-event analysis is a major field of statistics. When the event under study is death, the analysis focuses on the subjects' probability of survival and on their mortality hazard, that is, the "force of mortality" acting at any given moment. Patients with a chronic disease usually experience excess mortality compared with a population free of the disease. Studying the excess mortality hazard associated with a disease and investigating the impact of prognostic factors on this hazard are important public health issues in epidemiology. From a statistical point of view, modelling the (excess) mortality hazard requires accounting for potentially non-linear and time-dependent effects of prognostic factors as well as their interactions. Regression splines (i.e., flexible parametric piecewise polynomials) are well suited to this complexity. They make it easy to build non-linear effects and, for interactions between continuous variables, to form a multidimensional spline from two or more marginal one-dimensional splines. However, the flexibility of regression splines carries a risk of overfitting. To limit this risk, penalized regression splines have been proposed within the framework of generalized additive models. Their principle is to associate each spline with one or more penalty terms controlled by smoothing parameters, which set the degree of penalization. In practice, these parameters are unknown and must be estimated, just like the regression parameters. This thesis describes the development of a method to model the (excess) hazard using multidimensional penalized regression splines. Restricted cubic splines were used as one-dimensional splines or as marginal bases to form multidimensional splines by tensor products. The optimization process relies on two nested Newton-Raphson algorithms.
The smoothing parameters are estimated by optimizing a cross-validation criterion or the marginal likelihood of the smoothing parameters with an outer Newton-Raphson algorithm. At fixed smoothing parameters, the regression parameters are estimated by maximizing the penalized likelihood with an inner Newton-Raphson algorithm. The good properties of this approach in terms of statistical performance and numerical stability were then demonstrated through simulation. The method was then implemented in the R package survPen. Finally, it was applied to real data to investigate two epidemiological issues: the impact of social deprivation on excess mortality in cervical cancer patients and the impact of current age on excess mortality in multiple sclerosis patients.
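As a rough illustration of two ingredients named in the abstract, the sketch below builds a tensor-product design matrix from two marginal bases and runs an inner Newton-Raphson loop at fixed smoothing parameters. It is not the thesis code and does not use the survPen API: the function names `row_tensor` and `penalized_newton` are hypothetical, and a Poisson-type log-likelihood is used as a simple stand-in for the (excess) hazard likelihood. The outer loop that estimates the smoothing parameters is not shown.

```python
import numpy as np

def row_tensor(A, B):
    """Row-wise Kronecker product of two marginal bases.

    Row i of the result is kron(A[i], B[i]), which is how a tensor-product
    design matrix is formed from two one-dimensional bases.
    """
    n = A.shape[0]
    return (A[:, :, None] * B[:, None, :]).reshape(n, -1)

def penalized_newton(X, y, S_list, lams, n_iter=50, tol=1e-10):
    """Inner Newton-Raphson loop at FIXED smoothing parameters `lams`.

    Maximizes  sum(y * eta - exp(eta)) - 0.5 * beta' S_lam beta,
    with eta = X @ beta and S_lam = sum_j lams[j] * S_list[j].
    ASSUMPTION: the Poisson-type likelihood is only a stand-in; the hazard
    likelihood of the thesis is different.
    """
    S_lam = sum(lam * S for lam, S in zip(lams, S_list))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu) - S_lam @ beta   # penalized gradient
        hess = -(X.T * mu) @ X - S_lam         # penalized Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta - step                     # Newton-Raphson update
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

With an intercept-only design and no penalty, the loop recovers the log of the mean response; increasing the smoothing parameter shrinks the coefficient toward zero, which is the behaviour the penalty is meant to produce.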
Contributor : Abes Star
Submitted on : Thursday, November 14, 2019 - 3:23:53 PM
Last modification on : Monday, February 10, 2020 - 4:36:47 PM
Long-term archiving on : Saturday, February 15, 2020 - 3:11:09 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02363708, version 1



Mathieu Fauvernier. Splines multidimensionnelles pénalisées pour modéliser le taux de survenue d’un événement : application au taux de mortalité en excès et à la survie nette en épidémiologie des maladies chroniques. Bioinformatics [q-bio.QM]. Université de Lyon, 2019. French. ⟨NNT : 2019LYSE1129⟩. ⟨tel-02363708⟩


