
Evaluation of the use of public toxicological data for chemical hazard prediction through computational methods

Abstract : Currently, chemical safety assessment relies mostly on results from in vivo studies performed in laboratory animals. However, these studies are costly in terms of time, money and animals used, and are therefore not suited to the evaluation of thousands of compounds. In order to rapidly screen compounds for their potential toxicity and prioritize them for further testing, alternative solutions such as in vitro assays and computational predictive models are envisioned. The objective of this thesis is to evaluate how the public data from ToxCast and ToxRefDB can support the construction of such models in order to predict the in vivo effects induced by compounds based only on their chemical structure. To do so, after data pre-processing, we first focus on the prediction of in vitro bioactivity from chemical structure and then on the prediction of in vivo effects from in vitro bioactivity data. For the in vitro bioactivity prediction, we build and test various models based on descriptors of the compounds’ chemical structure. Since the learning data are highly imbalanced in favor of non-toxic compounds, we test a data augmentation technique and show that it improves model performance. We also perform a large-scale study to predict hundreds of in vitro assays from ToxCast and show that the stacked generalization ensemble method leads to reliable models when they are used within their applicability domain. For the in vivo effects prediction, we evaluate the link between results from in vitro assays targeting pathways known to induce endocrine effects and in vivo effects observed in endocrine organs during long-term studies. We highlight that, unexpectedly, these assays are not predictive of the in vivo effects, which raises the crucial question of the relevance of in vitro assays. We thus hypothesize that the selection of assays able to predict in vivo effects should be based on complementary information such as, in particular, mechanistic data.

Cited literature : 315 references
Contributor : Abes Star
Submitted on : Tuesday, May 19, 2020 - 4:01:18 PM
Last modification on : Monday, October 19, 2020 - 11:12:31 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02612815, version 1



Ingrid Grenet. Evaluation of the use of public toxicological data for chemical hazard prediction through computational methods. Machine Learning [cs.LG]. Université Côte d'Azur, 2019. English. ⟨NNT : 2019AZUR4050⟩. ⟨tel-02612815⟩


