
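Modélisation des arbres onco-généalogiques et application à la détermination de phénotypes cancéreux spécifiques favorisant une exploration génotypique ciblée [Modelling of onco-genealogical trees and application to the determination of specific cancer phenotypes favouring targeted genotypic exploration]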

Abstract : In oncogenetics, studying the cancer cases in a family pedigree makes it possible to orient the diagnosis towards particular mutations or associations of mutations, or to reject the hypothesis of a genetic susceptibility to cancer in the family. Although this diagnosis usually relies on the oncogeneticist, an algorithmic approach can be proposed to process the information contained in these pedigrees. Three methods have been developed for this purpose:

  • Use of the whole family pedigree as a model, then calculation of the mutation risk under various assumptions, keeping the most probable one, i.e. the one that best fits the model.
  • Generation of subtrees (skeletons containing, for example, all father-mother-son-daughter occurrences found in a tree) that summarize the oncogenetic information, and construction of family profiles by aggregation. The mutational risk is determined by calculating the distance between subtrees and profiles.
  • Use of a statistical summary counting cases by type of cancer and by age at diagnosis, together with other synthetic demographic data (celibacy rate, fertility indices, early procreation, etc.). These summaries are processed using principal component analysis (PCA) and hierarchical clustering in order to highlight groups of families with similar phenotypes, likely to correspond to specific genotypes.

These approaches were tested either on trees generated randomly using the known risks of breast/ovarian cancer induced by mutations of the BRCA genes, or on the oncogenetic database of the Jean Perrin Comprehensive Cancer Center, which contains several thousand pedigrees from families predisposed to cancer. This allowed us, in particular, to determine an optimal size for oncogenetic pedigrees. The generation of subtrees did not prove superior to the use of statistical summaries. With the latter, we developed a doubly hierarchical clustering (H²C), the first level corresponding to the families themselves and the second to the members of the families. This H²C still requires some validation. Finally, the PCAs on the summaries allowed us to group families efficiently, clearly discriminating, among the families predisposed to breast/ovarian cancer, the families with highly penetrant mutations (BRCA genes) from other families in which the deleterious mutations are presumably located on one or more other genes not yet recognized as such.
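To make the third approach more concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a toy data layout (one `(cancer_type, age_at_diagnosis)` tuple per family member) and three hypothetical summary features, and it clusters the summaries directly with average-linkage agglomerative clustering. The PCA step described in the abstract is omitted to keep the example dependency-free. All names (`FAMILIES`, `summarize`, `hierarchical_clusters`) and the age threshold of 50 are assumptions made for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical toy pedigrees: one (cancer_type, age_at_diagnosis) tuple per
# family member; (None, None) marks an unaffected relative.
FAMILIES = {
    "fam_A": [("breast", 38), ("ovary", 45), ("breast", 41), (None, None)],
    "fam_B": [("breast", 42), ("breast", 36), ("ovary", 50), (None, None)],
    "fam_C": [("breast", 67), (None, None), (None, None), (None, None)],
    "fam_D": [("breast", 71), (None, None), (None, None), ("colon", 75)],
}


def summarize(members, early_age=50):
    """Summarize a family as (share of breast cases, share of ovarian cases,
    share of members diagnosed before `early_age`). The feature choice is an
    assumption, not the summary used in the thesis."""
    n = len(members)
    breast = sum(1 for cancer, _ in members if cancer == "breast")
    ovary = sum(1 for cancer, _ in members if cancer == "ovary")
    early = sum(1 for cancer, age in members
                if cancer is not None and age < early_age)
    return (breast / n, ovary / n, early / n)


def average_linkage(cluster_a, cluster_b, vectors):
    """Mean Euclidean distance over all cross-cluster pairs of families."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


def hierarchical_clusters(vectors, k):
    """Agglomerative clustering: start with one cluster per family and merge
    the two closest clusters until only k clusters remain."""
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(
                clusters[ij[0]], clusters[ij[1]], vectors),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(cluster) for cluster in clusters)


summaries = {name: summarize(members) for name, members in FAMILIES.items()}
groups = hierarchical_clusters(summaries, k=2)
print(groups)
```

On this toy input, the two families with several early breast/ovarian cases end up in one group and the two families with isolated late-onset cases in the other. A fuller pipeline would standardize the features and project them with PCA before clustering, as the abstract describes.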
Contributor : Abes Star
Submitted on : Friday, October 29, 2021 - 4:48:33 PM
Last modification on : Wednesday, November 3, 2021 - 4:00:57 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03409455, version 1


Fabrice Kwiatkowski. Modélisation des arbres onco-généalogiques et application à la détermination de phénotypes cancéreux spécifiques favorisant une exploration génotypique ciblée. Statistiques [math.ST]. Université Clermont Auvergne, 2020. Français. ⟨NNT : 2020CLFAC077⟩. ⟨tel-03409455⟩


