A. Abdelaziz, H. Spahn-Langguth, K. Schramm, and I. V. Tetko, Consensus Modeling for HTS Assays Using In silico Descriptors Calculates the Best Balanced Accuracy in Tox21 Challenge, Frontiers in Environmental Science, vol.4, issue.2, 2016.

C. H. Allen, A. Koutsoukas, I. Cortés-Ciriano, D. S. Murrell, T. E. Malliavin et al., Improving the prediction of organism-level toxicity through integration of chemical, protein target and cytotoxicity qHTS data, Toxicology Research, vol.5, issue.3, pp.883-894, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01907207

E. Alpaydin, Introduction to machine learning, 2009.

D. Altshuler, R. M. Durbin, G. R. Abecasis, D. R. Bentley, A. Chakravarti et al., A map of human genome variation from population-scale sequencing, Nature, vol.467, pp.1061-1073, 2010.

K. Ambe, K. Ishihara, T. Ochibe, K. Ohya, S. Tamura et al., In Silico Prediction of Chemical-Induced Hepatocellular Hypertrophy Using Molecular Descriptors, Toxicological Sciences, vol.162, issue.2, pp.667-675, 2018.

M. E. Andersen and D. Krewski, Toxicity Testing in the 21st Century: Bringing the Vision to Life, Toxicological Sciences, vol.107, issue.2, pp.324-330, 2009.

M. E. Andersen and D. Krewski, The Vision of Toxicity Testing in the 21st Century: Moving from Discussion to Action, Toxicological Sciences, vol.117, issue.1, pp.17-24, 2010.

G. T. Ankley, R. S. Bennett, R. J. Erickson, D. J. Hoff, M. W. Hornung et al., Adverse outcome pathways: A conceptual framework to support ecotoxicology research and risk assessment, Environmental Toxicology and Chemistry, vol.29, issue.3, pp.730-741, 2010.

N. Baker, A. Boobis, L. Burgoon, E. Carney, R. Currie et al., Building a developmental toxicity ontology, Birth Defects Research, vol.110, pp.502-518, 2018.

P. Banerjee, V. B. Siramshetty, M. N. Drwal, and R. Preissner, Computational methods for prediction of in vitro effects of new chemical structures, Journal of Cheminformatics, vol.8, issue.1, p.51, 2016.

I. I. Baskin, Machine Learning Methods in Computational Toxicology, pp.119-139

G. E. Batista and M. C. Monard, An analysis of four missing data treatment methods for supervised learning, Applied Artificial Intelligence, vol.17, issue.5-6, pp.519-533, 2003.

R. A. Becker, D. A. Dreier, M. K. Manibusan, L. A. Cox, T. W. Simon et al., How well can carcinogenicity be predicted by high throughput "characteristics of carcinogens" mechanistic data?, Regulatory Toxicology and Pharmacology, vol.90, pp.185-196, 2017.

S. M. Bell, X. Chang, J. F. Wambaugh, D. G. Allen, M. Bartels et al., In vitro to in vivo extrapolation for high throughput prioritization and decision making, Toxicology in Vitro, vol.47, pp.213-227, 2018.

E. Benfenati, The CAESAR project for in silico models for the REACH legislation, Chemistry Central Journal, vol.4, 2010.

E. Benfenati, R. Benigni, D. M. DeMarini, C. Helma, D. Kirkland et al., Predictive Models for Carcinogenicity and Mutagenicity: Frameworks, State-of-the-Art, and Perspectives, Journal of Environmental Science and Health, vol.27, issue.2, pp.57-90, 2009.

E. Benfenati, A. Manganaro, and G. Gini, VEGA-QSAR: AI inside a platform for predictive toxicology, CEUR Workshop Proceedings, vol.1107, 2013.

R. Benigni, Structure-Activity Relationship Studies of Chemical Mutagens and Carcinogens: Mechanistic Investigations and Prediction Approaches, Chemical Reviews, vol.105, issue.5, pp.1767-1800, 2005.

R. Benigni, C. L. Battistelli, C. Bossa, A. Giuliani, and O. Tcheremenskaia, Endocrine Disruptors: Data-based survey of in vivo tests, predictive models and the Adverse Outcome Pathway, Regulatory Toxicology and Pharmacology, vol.86, pp.18-24, 2017.

R. Benigni, C. L. Battistelli, C. Bossa, A. Giuliani et al., Evaluation of the applicability of existing (Q)SAR models for predicting the genotoxicity of pesticides and similarity analysis related with genotoxicity of pesticides for facilitating of grouping and read across, vol.16, p.1598, 2019.

A. P. Bento, A. Gaulton, A. Hersey, B. Al-Lazikani, D. Michalovich et al., ChEMBL: a large-scale bioactivity database for drug discovery, Nucleic Acids Research, vol.40, issue.D1, pp.1100-1107, 2011.

M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter et al., KNIME: The Konstanz Information Miner, Studies in Classification, Data Analysis, and Knowledge Organization, 2007.

A. Bitsch, S. Escher, G. Lewin, C. Melber, N. Simetska et al., RepDose and FeDTex: Two databases focusing on systemic toxicity: First examples from analyses of repeated dose toxicity and reprotoxicity studies, Toxicology Letters, vol.180, pp.202-210, 2008.

A. Bitsch, S. Jacobi, C. Melber, U. Wahnschaffe, N. Simetska et al., REP-DOSE: A database on repeated dose toxicity studies of commercial chemicals-A multifunctional tool, Regulatory Toxicology and Pharmacology, vol.46, issue.3, pp.202-210, 2006.

M. Bodén, A Guide to Recurrent Neural Networks and Backpropagation, The Dallas project, 2002.

B. Boezio, K. Audouze, P. Ducrot, and O. Taboureau, Network-based Approaches in Pharmacology, Molecular Informatics, vol.36, issue.10, p.1700048, 2017.

A. R. Boobis, J. E. Doe, B. Heinrich-Hirsch, M. E. Meek, S. Munn et al., IPCS framework for analyzing the relevance of a noncancer mode of action for humans, Critical Reviews in Toxicology, vol.38, issue.2, pp.87-96, 2008.

M. Bouhifd, M. E. Andersen, C. Baghdikian, K. Boekelheide, K. M. Crofton et al., The Human Toxome Project, Alternatives to Animal Experimentation, vol.32, pp.112-124, 2015.

P. B. Brazdil and C. Soares, A comparison of ranking methods for classification algorithm selection, European Conference on Machine Learning, pp.63-75, 2000.

L. Breiman, Bagging predictors, Machine Learning, vol.24, pp.123-140, 1996.

L. Breiman, Random Forests, Machine Learning, vol.45, pp.5-32, 2001.

C. Bron and J. Kerbosch, Algorithm 457: finding all cliques of an undirected graph, Communications of the ACM, vol.16, issue.9, pp.575-577, 1973.

P. Browne, R. S. Judson, W. M. Casey, N. C. Kleinstreuer, and R. S. Thomas, Screening Chemicals for Estrogen Receptor Bioactivity Using a Computational Model, Environmental Science & Technology, vol.49, issue.14, pp.8804-8814, 2015.

J. S. Bus and R. A. Becker, Toxicity Testing in the 21st Century: A View from the Chemical Industry, Toxicological Sciences, vol.112, issue.2, pp.297-302, 2009.

F. Caiment, M. Tsamou, D. Jennen, and J. Kleinjans, Assessing compound carcinogenicity in vitro using connectivity mapping, Carcinogenesis, vol.35, issue.1, pp.201-207, 2014.

S. Capuzzi, R. Politi, O. Isayev, S. Farag, and A. Tropsha, QSAR Modeling of Tox21 Challenge Stress Response and Nuclear Receptor Signaling Toxicity Assays, Frontiers in Environmental Science, vol.4, issue.43, pp.3389-3392, 2016.

A. Cereto-Massagué, M. J. Ojeda, C. Valls, M. Mulero, S. Garcia-Vallvé et al., Molecular fingerprint similarity search in virtual screening, Methods, vol.71, pp.58-63, 2015.

K. J. Chandler, M. Barrier, S. Jeffay, H. P. Nichols, N. C. Kleinstreuer et al., Evaluation of 309 Environmental Chemicals Using a Mouse Embryonic Stem Cell Adherent Cell Differentiation and Cytotoxicity Assay, PLoS ONE, vol.6, issue.6, p.18540, 2011.

C. Chang and C. Lin, LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, p.27, 2011.

X. Chang, N. Kleinstreuer, P. Ceger, J. Hsieh, D. Allen et al., Application of Reverse Dosimetry to Compare In Vitro and In Vivo Estrogen Receptor Activity, Applied In Vitro Toxicology, vol.1, issue.1, pp.33-44, 2015.

O. Chapelle, B. Schölkopf, and A. Zien, Semi-supervised learning, IEEE Transactions on Neural Networks, vol.20, issue.3, pp.542-542, 2009.

N. Chawla, K. Bowyer, L. O. Hall, and P. W. Kegelmeyer, SMOTE: Synthetic Minority Over-sampling Technique, Journal of Artificial Intelligence Research, vol.16, pp.321-357, 2002.

N. Chawla, N. Japkowicz, and A. Kotcz, Special issue on learning from imbalanced data sets, ACM Sigkdd Explorations Newsletter, vol.6, issue.1, pp.1-6, 2004.

B. Chazelle, An optimal convex hull algorithm in any fixed dimension, Discrete & Computational Geometry, vol.10, pp.377-409, 1993.

J. J. Chen, C. Tsai, H. Moon, H. Ahn, J. J. Young et al., Decision threshold adjustment in class prediction, SAR and QSAR in Environmental Research, vol.17, pp.337-352, 2006.

T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.785-794, 2016.

Y. Chen, F. Cheng, L. Sun, W. Li, G. Liu et al., Computational models to predict endocrine-disrupting chemical binding with androgen or oestrogen receptors, Ecotoxicology and Environmental Safety, vol.110, pp.280-287, 2014.

A. Cherkasov, E. N. Muratov, D. Fourches, A. Varnek, I. I. Baskin et al., QSAR Modeling: Where Have You Been? Where Are You Going To?, vol.57, pp.4977-5010, 2014.

F. Chollet, , 2015.

S. Choudhary, A. Walker, K. Funk, C. Keenan, I. Khan et al., The standard for the exchange of nonclinical data (SEND): Challenges and promises, Toxicologic Pathology, vol.46, issue.8, pp.1006-1012, 2018.

H. Ciallella and H. Zhu, Advancing Computational Toxicology in the Big Data Era by Artificial Intelligence: Data-Driven and Mechanism-Driven Modeling for Chemical Toxicity, Chemical Research in Toxicology, 2019.

S. M. Cohen, A. R. Boobis, V. L. Dellarco, J. E. Doe, P. A. Fenner-Crisp et al., Chemical carcinogenicity revisited 3: Risk assessment of carcinogenic potential based on the current state of knowledge of carcinogenesis in humans, Regulatory Toxicology and Pharmacology, vol.103, pp.100-105, 2019.

EFSA Scientific Committee, Scientific opinion on the hazard assessment of endocrine disruptors: scientific criteria for identification of endocrine disruptors and appropriateness of existing test methods for assessing effects mediated by these substances on human health and the environment, EFSA Journal, vol.11, issue.3, p.3132, 2013.

Risk Assessment in the Federal Government: Managing the Process, 1983.

Toxicity Testing in the 21st Century: A Vision and a Strategy, 2007.

National Research Council et al., Science and judgment in risk assessment, 1994.

M. T. Cronin, Chapter 1 an introduction to chemical grouping, categories and read-across to predict toxicity, Chemical Toxicity Prediction: Category Formation and Read-Across, pp.1-29, 2013.

M. Daneshian, H. Kamp, J. Hengstler, M. Leist, and B. van de Water, Highlight report: Launch of a large integrated European in vitro toxicology project: EU-ToxRisk, Archives of Toxicology, vol.90, issue.5, pp.1021-1024, 2016.

Dassault Systèmes BIOVIA.

R. De Maesschalck, D. Jouan-Rimbaud, and D. L. Massart, The Mahalanobis distance, Chemometrics and Intelligent Laboratory Systems, vol.50, pp.1-18, 2000.

J. C. Dearden, M. T. Cronin, and K. L. Kaiser, How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR), SAR and QSAR in Environmental Research, vol.20, issue.3-4, pp.241-266, 2009.

V. L. Dellarco, D. McGregor, S. Co, S. M. Berry, A. R. Cohen et al., Thiazopyr and Thyroid Disruption: Case Study Within the Context of the 2006 IPCS Human Relevance Framework for Analysis of a Cancer Mode of Action, Critical Reviews in Toxicology, vol.36, issue.10, pp.793-801, 2006.

A. Dey, Machine learning algorithms: a review, International Journal of Computer Science and Information Technologies, vol.7, issue.3, pp.1174-1179, 2016.

E. Diamanti-Kandarakis, J. P. Bourguignon, L. C. Giudice, R. Hauser, G. S. Prins et al., Endocrine-Disrupting Chemicals: An Endocrine Society Scientific Statement, Endocrine Reviews, vol.30, issue.4, pp.293-342, 2009.

T. G. Dietterich, Ensemble methods in machine learning, Multiple Classifier Systems, pp.1-15, 2000.

S. D. Dimitrov, R. Diderich, T. Sobanski, T. S. Pavlov, G. V. Chankov et al., QSAR Toolbox - workflow and major functionalities, SAR and QSAR in Environmental Research, vol.27, issue.3, pp.203-219, 2016.

D. Ding, L. Xu, H. Fang, H. Hong, R. Perkins et al., The EDKB: an established knowledge base for endocrine disrupting chemicals, BMC Bioinformatics, vol.11, issue.6, p.5, 2010.

D. J. Dix, K. A. Houck, M. T. Martin, A. M. Richard, R. W. Setzer et al., The ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals, Toxicological Sciences, vol.95, issue.1, pp.5-12, 2007.

J. E. Doe, A. R. Boobis, V. Dellarco, P. A. Fenner-Crisp, A. Moretto et al., Chemical carcinogenicity revisited 2: Current knowledge of carcinogenesis shows that categorization as a carcinogen or non-carcinogen is not scientifically credible, Regulatory Toxicology and Pharmacology, vol.103, pp.124-129, 2019.

N. Draper and H. Smith, Applied regression analysis, vol.326, 2014.

M. Drwal, V. Siramshetty, P. Banerjee, A. Goede, R. Preissner et al., Molecular similarity-based predictions of the Tox21 screening outcome, Frontiers in Environmental Science, vol.3, p.54, 2015.

H. Du, Y. Cai, H. Yang, H. Zhang, X. Yuhan et al., In Silico Prediction of Chemicals Binding to Aromatase with Machine Learning Methods, Chemical Research in Toxicology, vol.30, pp.1209-1218, 2017.

A. Dudek, T. Arodz, and J. Gálvez, Computational Methods in Developing Quantitative Structure-Activity Relationships (QSAR): A Review, Combinatorial Chemistry & High Throughput Screening, vol.9, pp.213-228, 2006.

J. L. Durant, B. A. Leland, D. R. Henry, and J. G. Nourse, Reoptimization of MDL Keys for Use in Drug Discovery, Journal of Chemical Information and Computer Sciences, vol.42, issue.6, pp.1273-1280, 2002.

Guidance for the identification of endocrine disruptors in the context of, European Chemical Agency (ECHA) and European Food Safety Authority (EFSA) with the technical support of the Joint Research Centre (JRC), p.16

F. Eduati, L. M. Mangravite, T. Wang, H. Tang, J. C. Bare et al., Prediction of human population responses to toxic compounds by a collaborative competition, Nature Biotechnology, vol.33, issue.9, pp.933-940, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01246684

Guidance on the establishment of the residue definition for dietary risk assessment, EFSA Journal, vol.14, 2016.

T. Eissing, A Computational Systems Biology Software Platform for Multiscale Modeling and Simulation: Integrating Whole-Body Physiology, Disease Biology, and Molecular Reaction Networks, vol.2, 2011.

C. Elkan, The Foundations of Cost-Sensitive Learning, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, pp.973-978, 2001.

S. Enoch, Chemical Category Formation and Read-Across for the Prediction of Toxicity, pp.209-219, 2010.

Guidance for the setting of an acute reference dose (ARfD), 2001.

, concerning the Registration, Evaluation, Authorization, and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EC, European Commission. Regulation, 1907.

, on classification, labelling and packaging of substances and mixtures, amending and repealing Directives 67/548/EEC and 1999/45/EC, and amending Regulation (EC) No, vol.16, 1907.

Regulation (EC) No 1107/2009 of the European Parliament and of the Council of 21 October 2009 concerning the placing of plant protection products on the market and repealing Council Directives, European Commission

E. Fabian, C. Gomes, B. Birk, T. Williford, T. R. Hernandez et al., In vitro-to-in vivo extrapolation (IVIVE) by PBTK modeling for animal-free risk assessment approaches of potential endocrine-disrupting compounds, Archives of Toxicology, vol.93, issue.2, pp.401-416, 2019.

T. Fawcett and F. Provost, Adaptive fraud detection, Data Mining and Knowledge Discovery, vol.1, issue.3, pp.291-316, 1997.

M. Feher and T. Ewing, Global or local QSAR: Is there a way out?, QSAR & Combinatorial Science, vol.28, pp.850-855, 2009.

P. A. Fenner-Crisp, A. F. Maciorowski, and G. E. Timm, The Endocrine Disruptor Screening Program developed by the US Environmental Protection Agency, Ecotoxicology, vol.9, issue.1-2, pp.85-91, 2000.

M. R. Fielden, A. Adai, R. T. Dunn, A. Olaharski, G. Searfoss et al., Development and evaluation of a genomic signature for the prediction and mechanistic assessment of nongenotoxic hepatocarcinogens in the rat, Toxicological Sciences, 2011.

D. L. Filer, P. Kothiya, R. Setzer, R. Judson, and M. Martin, The ToxCast Pipeline for High-Throughput Screening Data, vol.33, pp.618-620, 2016.

R. B. Fitzpatrick, CPDB: Carcinogenic Potency Database, Medical Reference Services Quarterly, vol.27, issue.3, pp.303-311, 2008.

United Nations Economic Commission for Europe Secretariat, Globally Harmonized System of Classification and Labelling of Chemicals (GHS), 2009.

D. Fourches, E. Muratov, and A. Tropsha, Trust, but verify: On the importance of chemical structure curation in cheminformatics and QSAR modeling research, vol.50, pp.1189-1204, 2010.

E. A. Freeman and G. G. Moisen, A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa, Ecological Modelling, vol.217, issue.1-2, pp.48-58, 2008.

J. H. Friedman, Greedy function approximation: a gradient boosting machine, Annals of Statistics, pp.1189-1232, 2001.

D. Gadaleta, S. Manganelli, A. Roncaglioni, C. Toma, E. Benfenati et al., QSAR Modeling of ToxCast Assays Relevant to the Molecular Initiating Events of AOPs Leading to Hepatic Steatosis, Journal of Chemical Information and Modeling, vol.58, issue.8, pp.1501-1517, 2018.
URL : https://hal.archives-ouvertes.fr/ineris-02006100

F. Gatnik and A. Worth, Review of Software Tools for Toxicity Prediction, European Commission JRC, 2010.

E. Gelenbe, Stability of the random neural network model, Neural Computation, vol.2, issue.2, pp.239-247, 1990.

E. Gelenbe, Learning in the recurrent random neural network, Neural Computation, vol.5, issue.1, pp.154-164, 1993.

E. Gelenbe and S. Timotheou, Random neural networks with synchronized interactions, Neural Computation, vol.20, pp.2308-2324, 2008.

P. Geurts, Bias vs Variance Decomposition for Regression and Classification, pp.733-746, 2010.

A. K. Ghose, V. N. Viswanadhan, and J. J. Wendoloski, Prediction of Hydrophobic (Lipophilic) Properties of Small Organic Molecules Using Fragmental Methods: An Analysis of ALOGP and CLOGP Methods, The Journal of Physical Chemistry A, vol.102, issue.21, pp.3762-3772, 1998.

T. Gocht, E. Berggren, H. Ahr, I. Cotgreave, M. Cronin et al., The SEURAT-1 approach towards animal free human safety assessment, Alternatives to Animal Experimentation, vol.32, pp.9-24, 2015.

M. Goodarzi, B. Dejaegher, and Y. V. Heyden, Feature Selection Methods in QSAR Studies, Journal of AOAC International, vol.95, issue.3, pp.636-651, 2012.

I. Grenet, J. Comet, F. Schorsch, N. Ryan, J. Wicharg et al., Chemical in vitro bioactivity profiles are not informative about the long-term in vivo endocrine mediated toxicity, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02540249

I. Grenet, K. Merlo, J. P. Comet, R. Tertiaux, D. Rouquié et al., Stacked Generalization with Applicability Domain Outperforms Simple QSAR on in Vitro Toxicological Data, Journal of Chemical Information and Modeling, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02051775

I. Grenet, Y. Yin, and J. P. Comet, G-Networks to Predict the Outcome of Sensing of Toxicity, Sensors, vol.18, issue.10, p.3483, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02051779

I. Grenet, Y. Yin, J. P. Comet, and E. Gelenbe, Machine Learning to Predict Toxicity of Compounds, The 27th International Conference on Artificial Neural Networks (ICANN), pp.335-345, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02051852

C. J. Grondin, D. Sciaky, J. Wiegers, R. J. Johnson, T. C. Wiegers et al., The Comparative Toxicogenomics Database: update 2019, Nucleic Acids Research, vol.47, issue.D1, pp.948-954, 2018.

D. Guan, K. Fan, I. Spence, and S. Matthews, Combining machine learning models of in vitro and in vivo bioassays improves rat carcinogenicity prediction, Regulatory Toxicology and Pharmacology, vol.94, pp.8-15, 2018.

I. Guyon and A. Elisseeff, An Introduction to Variable and Feature Selection, Journal of Machine Learning Research, vol.3, pp.1157-1182, 2003.

D. Gómez and A. Rojas, An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification, Neural Computation, vol.28, issue.1, pp.216-228, 2016.

F. Güneş, R. Wolfinger, and P. Tan, Stacked ensemble models for improved prediction accuracy, SAS Global Forum Proceedings, 2017.

H. He and E. A. Garcia, Learning from Imbalanced Data, IEEE Transactions on Knowledge and Data Engineering, vol.21, issue.9, pp.1263-1284, 2009.

L. H. Hall and L. B. Kier, Electrotopological State Indices for Atom Types: A Novel Combination of Electronic, Topological, and Valence State Information, Journal of Chemical Information and Computer Sciences, vol.35, issue.6, pp.1039-1045, 1995.

H. Han, W. Wang, and B. Mao, Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning, Advances in Intelligent Computing, pp.878-887

L. Han, Y. Wang, and S. H. Bryant, Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem, BMC Bioinformatics, vol.9, issue.1, p.401, 2008.

C. E. Handford, C. T. Elliott, and K. Campbell, A review of the global pesticide legislation and the scale of challenge in reaching the global harmonization of food safety standards, Integrated Environmental Assessment and Management, vol.11, issue.4, pp.525-536, 2015.

C. Hansch, Quantitative structure-activity relationships and the unnamed science, Accounts of Chemical Research, vol.26, issue.4, pp.147-153, 1993.

C. Hansch, P. Maloney, T. Fujita, and R. Muir, Correlation of Biological Activity of Phenoxyacetic Acids with Hammett Substituent Constants and Partition Coefficients, Nature, vol.194, pp.178-180, 1962.

B. Hardy, G. Apic, P. Carthew, D. Clark, D. Cook et al., Toxicology Ontology Perspectives, Alternatives to Animal Experimentation, pp.139-156, 2012.

L. Harland, Open PHACTS: A Semantic Knowledge Infrastructure for Public and Commercial Drug Discovery Research, Knowledge Engineering and Knowledge Management, pp.1-7, 2012.

H. He, Y. Bai, E. A. Garcia, and S. Li, ADASYN: Adaptive synthetic sampling approach for imbalanced learning, IEEE International Joint Conference on Neural Networks, pp.1322-1328, 2008.

E. Helgee, L. Carlsson, S. Boyer, and U. Norinder, Evaluation of Quantitative Structure Activity Relationship Modeling Strategies: Local and Global Models, Journal of Chemical Information and Modeling, vol.50, issue.4, pp.677-689, 2010.

S. Heller, A. McNaught, S. Stein, D. Tchekhovskoi, and I. Pletnev, InChI - the worldwide chemical structure identifier standard, Journal of Cheminformatics, vol.5, issue.1, p.7, 2013.

V. J. Hodge and J. Austin, A survey of outlier detection methodologies, Artificial Intelligence Review, vol.22, issue.2, pp.85-126, 2004.

M. Hossin and M. N. Sulaiman, A review on evaluation metrics for data classification evaluations, International Journal of Data Mining & Knowledge Management Process, vol.5, issue.2, p.1, 2015.

C. W. Hsu, R. Huang, M. S. Attene-ramos, C. Austin, A. Simeonov et al., Advances in high-throughput screening technology for toxicology, International Journal of Risk Assessment and Management, vol.20, issue.1/2/3, pp.109-135, 2017.

R. Huang and M. Xia, Tox21 challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental toxicants and drugs, Frontiers in Environmental Science, vol.5, p.85, 2017.

R. Huang, M. Xia, S. Sakamuru, J. H. Zhao, S. Shahane et al., Modelling the Tox21 10K chemical profiles for in vivo toxicity prediction and mechanism characterization, Nature Communications, 2016.

E. A. Hubal, A. Richard, L. Aylward, S. Edwards, J. Gallagher et al., Advancing Exposure Characterization for Chemical Evaluation and Risk Assessment, Journal of Toxicology and Environmental Health, vol.13, issue.2-4, pp.299-313, 2010.

Y. Igarashi, N. Nakatsu, T. Yamashita, A. Ono, Y. Ohno et al., Open TG-GATEs: A large-scale toxicogenomics database, Nucleic Acids Research, vol.43, pp.921-927, 2014.

Global Assessment of the State-of-Science of Endocrine Disruptors, Geneva: World Health Organization, IPCS, 2002.

IPCS & OECD, IPCS risk assessment terminology, Geneva: World Health Organization, 2004.

M. Jamei, S. Marciniak, K. Feng, A. Barnett, G. Tucker et al., The Simcyp Population-based ADME Simulator, Expert Opinion on Drug Metabolism & Toxicology, vol.5, issue.2, pp.211-223, 2009.

J. Jaworska, M. Comber, C. Auer, and K. Van Leeuwen, Summary of a Workshop on Regulatory Acceptance of (Q)SARs for Human Health and Environmental Endpoints, Environmental Health Perspectives, vol.111, pp.1358-1360, 2003.

J. Jaworska, N. Nikolova-Jeliazkova, and T. Aldenberg, QSAR applicability domain estimation by projection of the training set descriptor space: a review, Alternatives to Laboratory Animals, vol.33, pp.445-459, 2005.

A. Jović, K. Brkić, and N. Bogunović, A review of feature selection methods with applications, 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp.1200-1205, 2015.

R. Judson, K. Houck, R. Kavlock, T. Knudsen, M. Martin et al., In Vitro Screening of Environmental Chemicals for Targeted Testing Prioritization: The ToxCast Project, vol.118, pp.485-492, 2010.

R. Judson, A. Richard, D. Dix, K. Houck, F. Elloumi et al., ACToR - Aggregated Computational Toxicology Resource, Toxicology and Applied Pharmacology, vol.233, issue.1, pp.7-13, 2008.

R. S. Judson, R. J. Kavlock, M. T. Martin, D. M. Reif, K. A. Houck et al., Perspectives on validation of high-throughput assays supporting 21st century toxicity testing, Alternatives to Animal Experimentation, vol.30, pp.51-57, 2013.

R. S. Judson, F. M. Magpantay, V. Chickarmane, C. Haskell, N. Tania et al., Integrated Model of Chemical Perturbations of a Biological Pathway Using 18 In Vitro High-Throughput Screening Assays for the Estrogen Receptor, Toxicological Sciences, vol.148, issue.1, pp.137-154, 2015.

E. R. Kabir, M. S. Rahman, and I. Rahman, A review on endocrine disruptors and their possible impacts on human health, Environmental Toxicology and Pharmacology, vol.40, issue.1, pp.241-258, 2015.

R. M. Karp, Reducibility among Combinatorial Problems, pp.85-103, 1972.

C. M. Keenan, J. F. Baker, A. E. Bradley, D. G. Goodman, T. Harada et al., International Harmonization of Nomenclature and Diagnostic Criteria (INHAND) progress to date and future plans, Journal of Toxicologic Pathology, vol.28, issue.1, pp.51-53, 2015.

M. Kim, R. Huang, A. Sedykh, W. Wang, M. Xia et al., Mechanism Profiling of Hepatotoxicity Caused by Oxidative Stress Using the Antioxidant Response Element Reporter Gene Assay Models and Big Data, Environmental Health Perspectives, vol.124, pp.634-641, 2015.

S. Kim, L. Han, B. Yu, V. D. Hähnke, E. E. Bolton et al., PubChem structure-activity relationship (SAR) clusters, Journal of Cheminformatics, 2015.

W. D. Klaren, C. Ring, J. E. Rager, C. M. Thompson, M. A. Harris et al., Identifying Attributes That Influence In Vitro-to-In Vivo Concordance by Comparing In Vitro Tox21 Bioactivity Versus In Vivo DrugMatrix Transcriptomic Responses Across 130 Chemicals, Toxicological Sciences, vol.167, issue.1, pp.157-171, 2018.

N. C. Kleinstreuer, P. Ceger, E. D. Watt, M. Martin, K. Houck et al., Development and Validation of a Computational Model for Androgen Receptor Activity, Chemical Research in Toxicology, vol.30, issue.4, pp.946-964, 2017.

N. C. Kleinstreuer, D. J. Dix, K. A. Houck, R. J. Kavlock, T. B. Knudsen et al., In Vitro Perturbations of Targets in Cancer Hallmark Processes Predict Rodent Chemical Carcinogenesis, Toxicological Sciences, vol.131, issue.1, pp.40-55, 2013.

N. C. Kleinstreuer, A. L. Karmaus, K. Mansouri, D. G. Allen, J. M. Fitzpatrick et al., Predictive models for acute oral systemic toxicity: A workshop to bridge the gap from research to regulation, Computational Toxicology, vol.8, pp.21-24, 2018.

K. Koch, Bayes' theorem, Bayesian Inference with Geodetic Applications, pp.4-8, Springer, 1990.

T. Kohonen, The self-organizing map, Proceedings of the IEEE, vol.78, issue.9, pp.1464-1480, 1990.

S. B. Kotsiantis, Supervised Machine Learning: A Review of Classification Techniques, Proceedings of the 2007 Conference on Emerging Artificial Intelligence Applications in Computer Engineering, pp.3-24, 2007.

D. Krewski, D. Acosta, M. Andersen, H. Anderson, J. Bailar et al., Toxicity Testing in the 21st Century: A Vision and A Strategy, Journal of Toxicology and Environmental Health, Part B: Critical Reviews, vol.13, pp.51-138, 2010.

M. Kubat, R. C. Holte, and S. Matwin, Machine learning for the detection of oil spills in satellite radar images, Machine Learning, vol.30, pp.195-215, 1998.

L. Kuepfer, C. Niederalt, T. Wendl, J. Schlender, S. Willmann et al., Applied Concepts in PBPK Modeling: How to Build a PBPK/PD Model, CPT: Pharmacometrics & Systems Pharmacology, vol.5, issue.10, pp.516-531

M. Kuhn, The caret Package, 2009.

A. Lagunin, A. Zakharov, D. Filimonov, and V. Poroikov, QSAR Modelling of Rat Acute Toxicity on the Basis of PASS Prediction, Molecular Informatics, vol.30, issue.2-3, pp.241-250, 2011.

J. Lamb, E. D. Crawford, D. Peck, J. W. Modell, I. C. Blat et al., The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease, Science, vol.313, issue.5795, pp.1929-1935, 2006.

I. A. Lea, H. Gong, A. Paleja, A. Rashid, and J. Fostel, CEBS: A comprehensive annotated database of toxicological data, Nucleic Acids Research, vol.45, pp.964-971, 2016.

Y. Lecun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, p.436, 2015.

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.

M. Leist, A. Ghallab, R. Graepel, R. Marchan, R. Hassan et al., Adverse outcome pathways: opportunities, limitations and open questions, Archives of Toxicology, vol.91, issue.11, pp.3477-3505, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01968849

M. Leshno, V. Lin, A. Pinkus, and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Networks, vol.6, issue.6, pp.861-867, 1993.

T. Li, S. Zhu, and M. Ogihara, Using discriminant analysis for multi-class classification: an experimental investigation, Knowledge and Information Systems, vol.10, issue.4, pp.453-472, 2006.

X. Li, L. Chen, G. Cheng, F. Liu, X. Shen et al., In Silico Prediction of Chemical Acute Oral Toxicity Using Multi-Classification Methods, Journal of Chemical Information and Modeling, vol.54, issue.4, pp.1061-1069, 2014.

C. Lipinski and A. Hopkins, Navigating chemical space for biology and medicine, Nature, vol.432, issue.7019, pp.855-861, 2004.

J. Liu, K. Mansouri, R. S. Judson, M. T. Martin, H. Hong et al., Predicting Hepatotoxicity Using ToxCast in Vitro Bioactivity and Chemical Structure, Chemical Research in Toxicology, vol.28, issue.4, pp.738-751, 2015.

J. Liu, G. Patlewicz, A. J. Williams, R. Thomas, and I. Shah, Predicting Organ Toxicity Using in Vitro Bioactivity Data and Chemical Structure, Chemical Research in Toxicology, vol.30, issue.11, pp.2046-2059, 2017.

W. Loh, Classification and regression tree methods, Wiley StatsRef: Statistics Reference Online, 2008.

J. Louisse, K. Beekmann, and I. M. Rietjens, Use of Physiologically Based Kinetic Modeling-Based Reverse Dosimetry to Predict in Vivo Toxicity from in Vitro Data, Chemical Research in Toxicology, vol.30, issue.1, pp.114-125, 2017.

Y. Low, A. Sedykh, I. Rusyn, and A. Tropsha, Integrative Approaches for Predicting In Vivo Effects of Chemicals from their Structural Descriptors and the Results of Short-Term Biological Assays, Current Topics in Medicinal Chemistry, vol.14, issue.11, pp.1356-1364, 2014.

Y. Low, T. Uehara, Y. Minowa, H. Yamada, Y. Ohno et al., Predicting drug-induced hepatotoxicity using QSAR and toxicogenomics approaches, Chemical Research in Toxicology, 2011.

S. Ma and J. Huang, Penalized feature selection and classification in bioinformatics, Briefings in bioinformatics, vol.9, pp.392-403, 2008.

K. Madasamy and M. Ramaswami, Data Imbalance and Classifiers: Impact and Solutions from a Big Data Perspective, IJCIR, vol.13, issue.9, pp.2267-2281, 2017.

Y. Malgrange, Recherche des sous-matrices premières d'une matrice à coefficients binaires. applications à certains problèmes de graphe, Proceedings of the Deuxième Congrès de l'AFCALTI, pp.231-242, 1962.

C. A. Marchant, K. Briggs, and A. Long, In Silico Tools for Sharing Data and Knowledge on Toxicity and Metabolism: Derek for Windows, Meteor, and Vitic, Toxicology Mechanisms and Methods, vol.18, pp.177-187, 2008.

E. Martin, P. Mukherjee, D. Sullivan, and J. Jansen, Profile-QSAR: A Novel meta-QSAR Method that Combines Activities across the Kinase Family To Accurately Predict Affinity, Selectivity, and Cellular Activity, Journal of Chemical Information and Modeling, vol.51, issue.8, pp.1942-1956, 2011.

E. J. Martin, V. R. Polyakov, L. Tian, and R. C. Perez, Profile-QSAR 2.0: Kinase Virtual Screening Accuracy Comparable to Four-Concentration IC50s for Realistically Novel Compounds, Journal of Chemical Information and Modeling, vol.57, issue.8, pp.2077-2088, 2017.

M. T. Martin, T. B. Knudsen, D. M. Reif, K. A. Houck, R. S. Judson et al., Predictive model of rat reproductive toxicity from ToxCast high throughput screening, Biology of Reproduction, vol.85, issue.2, pp.327-366, 2011.

C. J. Mattingly, G. T. Colby, J. N. Forrest, and J. L. Boyer, The Comparative Toxicogenomics Database (CTD), Environmental Health Perspectives, vol.111, pp.793-795, 2003.

A. Mayr, G. Klambauer, T. Unterthiner, and S. Hochreiter, DeepTox: Toxicity Prediction using Deep Learning, Frontiers in Environmental Science, vol.3, issue.80, 2016.

M. A. Mazurowski, P. A. Habas, J. M. Zurada, J. Y. Lo, J. A. Baker et al., Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance, Neural Networks, vol.21, issue.2-3, pp.427-436, 2008.

B. Meek and J. Doull, Pragmatic Challenges for the Vision of Toxicity Testing in the 21st Century in a Regulatory Context: Another Ames Test, Toxicological Sciences, vol.108, issue.1, pp.19-21, 2009.

B. Meek, C. M. Palermo, A. N. Bachman, C. M. North, and R. Lewis, Mode of action human relevance (species concordance) framework: Evolution of the Bradford Hill considerations and comparative analysis of weight of evidence, Journal of applied toxicology, vol.34, pp.595-606, 2014.

B. H. Menze, B. M. Kelm, R. Masuch, U. Himmelreich, P. Bachert et al., A comparison of random forest and its gini importance with standard chemometric methods for the feature selection and classification of spectral data, BMC Bioinformatics, vol.10, issue.1, p.213, 2009.

D. Meyer, Support Vector Machines. The Interface to libsvm in package e1071, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00555258

J. B. Mitchell, Machine learning methods in chemoinformatics, Wiley Interdisciplinary Reviews: Computational Molecular Science, vol.4, issue.5, pp.468-481, 2014.

T. Netzeva, A. Worth, T. Aldenberg, R. Benigni, M. T. Cronin et al., Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships, ATLA, vol.33, pp.155-173, 2005.

H. W. Ng, S. Doughty, H. Luo, H. Ye, W. Ge et al., Development and Validation of Decision Forest Model for Estrogen Receptor Binding Prediction of Chemicals Using Large Data Sets, Chemical Research in Toxicology, vol.28, issue.12, pp.2343-2351, 2015.

W. S. Noble, What is a Support Vector Machine?, Nature biotechnology, vol.24, pp.1565-1572, 2007.

U. Norinder and S. Boyer, Conformal prediction classification of a large data set of environmental chemicals from toxcast and tox21 estrogen receptor assays, Chemical research in toxicology, vol.29, issue.6, pp.1003-1010, 2016.

N. M. O'boyle, Towards a Universal SMILES representation - A standard method to generate canonical SMILES based on the InChI, Journal of Cheminformatics, vol.4, p.22, 2012.

N. M. O'boyle, C. Morley, and G. R. Hutchison, Pybel: a Python wrapper for the Open-Babel cheminformatics toolkit, Chemistry Central Journal, vol.2, issue.1, 2008.

OECD, Conceptual Framework for Testing and Assessment of Endocrine Disrupters, Official Journal of the European Union, 2002.

OECD, Directive 2004/10/EC of the European Parliament and of the Council of 11 February 2004. The OECD Principles of Good Laboratory Practice (GLP), Official Journal of the European Union, 2004.

OECD, Guidance Document on the Validation of (Quantitative) Structure-Activity Relationship [(Q)SAR] Models, p.154, 2014.

OECD, Guidance Document for the use of Adverse Outcome Pathways in developing Integrated Approaches to Testing and Assessment (IATA), Series on Testing and Assessment, vol.260, 2017.

M. Omran, A. Engelbrecht, and A. Salman, An overview of clustering methods, Intelligent Data Analysis, vol.11, pp.583-605, 2007.

S. Palei and S. Das, Logistic regression model for prediction of roof fall risks in bord and pillar workings in coal mines: An approach, Safety Science, vol.47, pp.88-96, 2009.

S. Parasuraman, Toxicological screening, Journal of Pharmacology & Pharmacotherapeutics, vol.2, pp.74-79, 2011.

G. Patlewicz, N. Jeliazkova, R. J. Safford, A. Worth, and B. Aleksiev, An evaluation of the implementation of the Cramer classification scheme in the Toxtree software, SAR and QSAR in environmental research, vol.19, pp.495-524, 2008.

S. Patro and K. K. Sahu, Normalization: A preprocessing stage, 2015.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

R. Peeters, The maximum edge biclique problem is NP-complete, Discrete Applied Mathematics, vol.131, issue.3, pp.651-654, 2003.

P. Peres-neto, D. A. Jackson, and K. Somers, How Many Principal Components? Stopping Rules for Determining the Number of Non-Trivial Axes Revisited, Computational Statistics and Data Analysis, vol.49, pp.974-997, 2005.

D. Petrakis, L. Vassilopoulou, C. Mamoulakis, C. Psycharakis, A. Anifantaki et al., Endocrine Disruptors Leading to Obesity and Related Diseases, International Journal of Environmental Research and Public Health, vol.14, issue.10, p.1282, 2017.

L. M. Plunkett, M. A. Kaplan, and R. A. Becker, Challenges in using the ToxRefDB as a resource for toxicity prediction modeling, Regulatory Toxicology and Pharmacology, vol.72, issue.3, pp.610-614, 2015.

P. Pradeep, R. J. Povinelli, S. White, and S. J. Merrill, An ensemble model of QSAR tools for regulatory risk assessment, Journal of Cheminformatics, vol.8, issue.1, p.48, 2016.

. R-core-team, R: A Language and Environment for Statistical Computing, 2013.

A. B. Raies and V. B. Bajic, In silico toxicology: computational methods for the prediction of chemical toxicity, Wiley Interdisciplinary Reviews: Computational Molecular Science, vol.6, pp.147-172, 2016.

H. Raunio, In Silico toxicology - non-testing methods, Frontiers in Pharmacology, vol.2, p.33, 2011.

D. M. Reif, M. T. Martin, R. J. Kavlock, R. S. Judson, and D. J. Dix, Profiling Chemicals Based on Chronic Toxicity Results from the U.S. EPA ToxRef Database, Environmental Health Perspectives, vol.117, issue.3, pp.392-399, 2008.

K. Ribay, M. T. Kim, W. Wang, D. Pinolini, and H. Zhu, Predictive Modeling of Estrogen Receptor Binding Agents Using Advanced Cheminformatics Tools and Massive Public Data, Frontiers in Environmental Science, vol.4, p.12, 2016.

A. Richard, DSSTox web site launch: Improving public access to databases for building structure-toxicity prediction models, vol.2, pp.103-108, 2004.

A. M. Richard, R. S. Judson, K. A. Houck, C. M. Grulke, P. Volarath et al., ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology, Chemical Research in Toxicology, vol.29, issue.8, pp.1225-1251, 2016.

A. M. Richard, C. Yang, and R. S. Judson, Toxicity Data Informatics: Supporting a New Paradigm for Toxicity Prediction, Toxicology Mechanisms and Methods, vol.18, issue.2-3, pp.103-118, 2008.

S. Riniker, Y. Wang, J. L. Jenkins, and G. A. Landrum, Using Information from Historical High-Throughput Screens to Predict Active Compounds, Journal of Chemical Information and Modeling, vol.54, issue.7, pp.1880-1891, 2014.

D. Rogers and M. Hahn, Extended-Connectivity Fingerprints, Journal of Chemical Information and Modeling, vol.50, issue.5, pp.742-754, 2010.

F. P. Roth and J. Klekota, Chemical substructures that enrich for biological activity, Bioinformatics, vol.24, issue.21, pp.2518-2525, 2008.

D. Rouquie, M. Heneweer, J. Botham, H. Ketelslegers, L. K. Markell et al., Contribution of New Technologies to Characterisation and Prediction of Adverse Effects, Critical Reviews in Toxicology, vol.45, pp.1-12, 2015.

D. Rouquié, H. Tinwell, O. Blanck, F. Schorsch, D. Geter et al., Thyroid tumor formation in the male mouse induced by fluopyram is mediated by activation of hepatic CAR/PXR nuclear receptors, Regulatory Toxicology and Pharmacology, vol.70, issue.3, pp.673-680, 2014.

C. Rovida and T. Hartung, Re-evaluation of animal numbers and costs for in vivo tests to accomplish REACH legislation requirements for chemicals - a report by the transatlantic think tank for toxicology, Alternatives to Animal Experimentation, vol.26, pp.187-208, 2009.

W. M. Russell, R. L. Burch, and C. W. Hume, The principles of humane experimental technique, vol.238, 1959.

I. Rusyn, A. Sedykh, Y. Low, K. Z. Guyton, and A. Tropsha, Predictive Modeling of Chemical Hazard by Integrating Numerical Descriptors of Chemical Structures and Short-term Toxicity Assay Data, Toxicological Sciences, vol.127, issue.1, pp.1-9, 2012.

N. Ryan, A user's guide for accessing and interpreting ToxCast data.

C. Rücker, G. Rücker, and M. Meringer, y-Randomization and Its Variants in QSPR/QSAR, Journal of Chemical Information and Modeling, vol.47, issue.6, pp.2345-2357, 2007.

Y. Sakuratani, H. Q. Zhang, S. Nishikawa, K. Yamazaki, T. Yamada et al., Hazard Evaluation Support System (HESS) for predicting repeated dose toxicity using toxicological categories, SAR and QSAR in Environmental Research, vol.24, issue.5, pp.351-363, 2013.

K. T. Savjani, A. K. Gajjar, and J. K. Savjani, Drug Solubility: Importance and Enhancement Techniques, ISRN Pharmaceutics, vol.2012, pp.1-10, 2012.

B. Scholkopf and A. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, 2001.

T. T. Schug, A. F. Johnson, L. S. Birnbaum, T. Colborn, L. J. Guillette et al., Endocrine Disruptors: Past Lessons and Future Directions, Molecular Endocrinology, vol.30, pp.833-847, 2016.

T. W. Schultz, T. I. Netzeva, and M. T. Cronin, Selection of data sets for qsars: Analyses of tetrahymena toxicity from aromatic compounds, SAR and QSAR in Environmental Research, vol.14, issue.1, pp.59-81, 2003.

A. Sedykh, H. Zhu, H. Tang, L. Zhang, A. Richard et al., Use of in Vitro HTS-Derived Concentration-Response Data as Biological Descriptors Improves the Accuracy of QSAR Models of in Vivo Toxicity, Environmental Health Perspectives, vol.119, issue.3, pp.364-370, 2011.

I. Shah, K. Houck, R. S. Judson, R. J. Kavlock, M. T. Martin et al., Using Nuclear Receptor Activity to Stratify Hepatocarcinogens, PLoS ONE, vol.6, issue.2, p.14584, 2011.

L. A. Shalabi and Z. Shaaban, Normalization as a Preprocessing Engine for Data Mining and the Approach of Preference Matrix, 2006 International Conference on Dependability of Computer Systems, pp.207-214, 2006.

J. Shen, L. Xu, H. Fang, A. M. Richard, J. D. Bray et al., EADB: An estrogenic activity database for assessing potential endocrine activity, Toxicological Sciences, pp.277-291, 2013.

R. P. Sheridan, Global Quantitative Structure-Activity Relationship Models vs Selected Local Models as Predictors of Off-Target Activities for Project Compounds, Journal of Chemical Information and Modeling, vol.54, issue.4, pp.1083-1092, 2014.

R. P. Sheridan, B. P. Feuston, V. N. Maiorov, and S. K. Kearsley, Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR, Journal of Chemical Information and Computer Sciences, vol.44, issue.6, pp.1912-1928, 2004.

B. Silverman, Density estimation for statistics and data analysis. Routledge, 2018.

N. S. Sipes, M. T. Martin, P. Kothiya, D. M. Reif, R. S. Judson et al., Profiling 976 ToxCast Chemicals across 331 Enzymatic and Receptor Signaling Assays, Chemical Research in Toxicology, vol.26, issue.6, pp.878-895, 2013.

N. S. Sipes, M. T. Martin, D. M. Reif, N. C. Kleinstreuer, R. S. Judson et al., Predictive Models of Prenatal Developmental Toxicity from ToxCast High-Throughput Screening Data, Toxicological Sciences, vol.124, issue.1, pp.109-127, 2011.

J. Smalley, T. Gant, and S. Zhang, Application of connectivity mapping in predictive toxicology based on gene-expression similarity, Toxicology, vol.268, pp.143-146, 2009.

C. Sonich-mullin, R. Fielder, J. Wiltse, K. Baetcke, J. Dempsey et al., IPCS Conceptual Framework for Evaluating a Mode of Action for Chemical Carcinogenesis, Regulatory Toxicology and Pharmacology, vol.34, issue.2, pp.146-152, 2001.

T. Steger-hartmann and F. Pognan, The eTOX Consortium: To Improve the Safety Assessment of New Drug Candidates, Pharmazeutische Medizin, vol.1, pp.3-13, 2017.

T. Steger-hartmann, F. Pognan, F. Sanz, C. Diaz, A. Sutter et al., In silico prediction of in vivo toxicity -the first steps of the eTox consortium, Toxicology Letters, vol.196, pp.250-251, 2010.

C. Strobl, A. Boulesteix, T. Kneib, T. Augustin, and A. Zeileis, Conditional Variable Importance for Random Forests, BMC Bioinformatics, vol.9, issue.1, p.307, 2008.

A. Subramanian, R. Narayan, S. M. Corsello, D. D. Peck, T. E. Natoli et al., A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles, Cell, vol.171, issue.6, pp.1437-1452, 2017.

L. Sun, H. Yang, Y. Cai, W. Li, G. Liu et al., In Silico Prediction of Endocrine Disrupting Chemicals Using Single-Label and Multilabel Models, Journal of Chemical Information and Modeling, vol.59, pp.973-982, 2019.

R. Sutton and A. Barto, Introduction to reinforcement learning, vol.135, 1998.

F. Svensson, U. Norinder, and A. Bender, Modelling compound cytotoxicity using conformal prediction and PubChem HTS data, Toxicology Research, vol.6, issue.1, pp.73-80, 2017.

V. Svetnik, A. Liaw, C. Tong, J. C. Culberson, R. P. Sheridan et al., Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling, Journal of Chemical Information and Computer Sciences, vol.43, issue.6, pp.1947-1958, 2003.

O. Taboureau and K. Audouze, Human environmental disease network: A computational model to assess toxicology of contaminants, Alternatives to Animal Experimentation, vol.34, pp.289-300, 2017.

J. Tang, S. Alelyani, and H. Liu, Feature selection for classification: A review, Data Classification: Algorithms and Applications, vol.01, pp.37-64, 2014.

O. Tcheremenskaia, R. Benigni, I. Nikolova, N. Jeliazkova, S. Escher et al., OpenTox predictive toxicology framework: Toxicological ontology and semantic media wiki-based OpenToxipedia, Journal of biomedical semantics, vol.3, issue.1, p.7, 2012.

A. Teasdale, ICH M7, chapter 24, pp.667-699, 2017.

I. V. Tetko, V. Y. Tanchuk, T. N. Kasheva, and A. E. Villa, Estimation of Aqueous Solubility of Chemical Compounds Using E-State Indices, Journal of Chemical Information and Computer Sciences, vol.41, issue.6, pp.1488-1493, 2001.

R. Thomas, The US Federal Tox21 Program: A Strategic and Operational Plan for Continued Leadership, Alternatives to Animal Experimentation, vol.35, pp.163-168, 2018.

R. S. Thomas, M. B. Black, L. Li, E. Healy, T. Chu et al., A Comprehensive Statistical Analysis of Predicting In Vivo Hazard Using High-Throughput In Vitro Screening, Toxicological Sciences, vol.128, issue.2, pp.398-417, 2012.

R. R. Tice, C. P. Austin, R. J. Kavlock, and J. R. Bucher, Improving the Human Hazard Characterization of Chemicals: A Tox21 Update, Environmental Health Perspectives, vol.121, issue.7, pp.756-765, 2013.

J. G. Topliss and R. J. Costello, Chance correlations in structure-activity studies using multiple regression analysis, Journal of Medicinal Chemistry, vol.15, issue.10, pp.1066-1068, 1972.

A. Tropsha, Best Practices for QSAR Model Development, Validation, and Exploitation, Molecular Informatics, vol.29, issue.6-7, pp.476-488, 2010.

T. Uehara, A. Ono, T. Maruyama, I. Kato, H. Yamada et al., The Japanese toxicogenomics project: Application of toxicogenomics, Molecular Nutrition & Food Research, vol.54, issue.2, pp.218-227, 2010.

L. van der Maaten, E. Postma, and J. van den Herik, Dimensionality reduction: a comparative review, Journal of Machine Learning Research, vol.10, pp.66-71, 2009.

L. N. Vandenberg, R. T. Zoeller, W. V. Welshons, J. P. Myers, T. Colborn et al., Hormones and Endocrine-Disrupting Chemicals: Low-Dose Effects and Nonmonotonic Dose Responses, Endocrine Reviews, vol.33, issue.3, pp.378-455, 2012.

W. N. Venables and B. D. Ripley, Modern Applied Statistics with S, 2002.

D. L. Villeneuve, D. Crump, N. Garcia-reyero, M. Hecker, T. H. Hutchinson et al., Adverse Outcome Pathway (AOP) Development I: Strategies and Principles, Toxicological Sciences, vol.142, issue.2, pp.312-320, 2014.

Y. Wang, S. H. Bryant, T. Cheng, J. Wang, A. Gindulyte et al., PubChem BioAssay: 2017 update, Nucleic Acids Research, vol.45, issue.D1, pp.955-963, 2017.

Y. Wang, J. Xiao, T. O. Suzek, J. Zhang, J. Wang et al., PubChem: a public information system for analyzing bioactivities of small molecules, Nucleic Acids Research, pp.623-633, 2009.

W. Warr, Representation of chemical structures, Wiley Interdisciplinary Reviews: Computational Molecular Science, vol.1, pp.557-579, 2011.

M. Waters, S. Stasiewicz, B. Merrick, K. Tomer, P. Bushel et al., CEBS Chemical Effects in Biological Systems: a public data repository integrating study design and toxicity data with microarray and proteomics data, Nucleic Acids Research, vol.36, pp.892-900, 2007.

S. Watford, A. Adrian, J. Wignall, J. Brown, and M. Martin, ToxRefDB 2.0: Improvements in Capturing Qualitative and Quantitative Data from in vivo Toxicity Studies, 2017.

D. Weininger, SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules, Journal of Chemical Information and Computer Sciences, vol.28, issue.1, pp.31-36, 1988.

J. Wenzel, H. Matter, and F. Schmidt, Predictive Multitask Deep Neural Network Models for ADME-Tox Properties: Learning from Large Data Sets, Journal of Chemical Information and Modeling, vol.59, issue.3, pp.1253-1268, 2019.

C. Wittwehr, S. Munn, B. Landesmann, and M. Whelan, Adverse Outcome Pathways Knowledge Base (AOP-KB), Toxicology Letters, vol.238, p.309, 2015.

S. Wold, K. Esbensen, and P. Geladi, Principal component analysis, Chemometrics and Intelligent Laboratory Systems, vol.2, pp.37-52, 1987.

D. C. Wolf, S. M. Cohen, A. R. Boobis, V. L. Dellarco, P. A. Fenner-crisp et al., Chemical carcinogenicity revisited 1: A unified theory of carcinogenicity based on contemporary knowledge, Regulatory Toxicology and Pharmacology, vol.103, pp.86-92, 2019.

D. H. Wolpert, Stacked generalization, Neural Networks, vol.5, issue.2, pp.241-259, 1992.

D. H. Wolpert, The Lack of A Priori Distinctions Between Learning Algorithms, Neural Computation, vol.8, issue.7, pp.1341-1390, 1996.

A. Worth, The Role of QSAR Methodology in the Regulatory Assessment of Chemicals, pp.367-382, 2010.

L. Wu, Z. Liu, S. Auerbach, R. Huang, M. Chen et al., Integrating Drug's Mode of Action into Quantitative Structure-Activity Relationships for Improved Prediction of Drug-Induced Liver Injury, Journal of Chemical Information and Modeling, vol.57, issue.4, pp.1000-1006, 2017.

Q. Xu and Y. Liang, Monte Carlo Cross Validation, Chemometrics and Intelligent Laboratory Systems, vol.56, pp.1-11, 2001.

L. Yang, J. Rathman, C. Yang, K. Arvidson, M. Cronin et al., Development of a COSMOS DB to support in silico modelling for cosmetics ingredients and related chemicals, The Toxicologist - A Supplement to Toxicological Sciences, pp.132-185, 2013.

P. Yang, Y. Yang, B. Zhou, and A. Zomaya, A Review of Ensemble Methods in Bioinformatics, Current Bioinformatics, vol.5, issue.4, pp.296-308, 2010.

C. W. Yap, PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints, Journal of Computational Chemistry, vol.32, issue.7, pp.1466-1474, 2011.

P. Yazgana and A. Kusakci, A Literature Survey on Association Rule Mining Algorithms, Southeast Europe Journal of Soft Computing, vol.5, p.1859, 2016.

S. Yen and Y. Lee, Cluster-based Under-sampling Approaches for Imbalanced Data Distributions, Expert Systems with Applications, vol.36, pp.5718-5727, 2006.

Y. Yin and E. Gelenbe, Single-cell based random neural network for deep learning, 2017 International Joint Conference on Neural Networks (IJCNN), pp.86-93, 2017.

C. Ying, M. Qi-guang, L. Jia-chen, and G. Lin, Advance and prospects of AdaBoost algorithm, Acta Automatica Sinica, vol.39, issue.6, pp.745-758, 2013.

Q. Zang, D. M. Rotroff, and R. S. Judson, Binary Classification of a Large Collection of Environmental Chemicals from Estrogen Receptor Assays by Quantitative Structure-Activity Relationship and Machine Learning Methods, Journal of Chemical Information and Modeling, vol.53, issue.12, pp.3244-3261, 2013.

C. Zhang, F. Cheng, W. Li, G. Liu, P. W. Lee et al., In silico Prediction of Drug Induced Liver Toxicity Using Substructure Pattern Recognition Method, Molecular Informatics, vol.35, issue.3-4, pp.136-144, 2016.

S. Zhang, C. Zhang, and Q. Yang, Data Preparation for Data Mining, Applied Artificial Intelligence, vol.17, pp.375-381, 2003.

Y. Zhang, C. A. Phillips, G. L. Rogers, E. J. Baker, E. J. Chesler et al., On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types, BMC Bioinformatics, vol.15, issue.1, p.110, 2014.

Y. Zhang, Y. Yin, D. Guo, X. Yu, and L. Xiao, Cross-validation based weights and structure determination of Chebyshev-polynomial neural networks for pattern classification, Pattern Recognition, vol.47, issue.10, pp.3414-3428, 2014.

H. Zhu, L. Ye, A. Richard, A. Golbraikh, F. Wright et al., A Novel Two-Step Hierarchical Quantitative Structure-Activity Relationship Modeling Work Flow for Predicting Acute Toxicity of Chemicals in Rodents, Environmental health perspectives, vol.117, pp.1257-1264, 2009.

H. Zhu, J. Zhang, M. T. Kim, A. Boison, A. Sedykh et al., Big Data in Chemical Toxicity Research: The Use of High-Throughput Screening Assays To Identify Potential Toxicants, Chemical Research in Toxicology, vol.27, issue.10, pp.1643-1651, 2014.

X. Zhu, A. Sedykh, and S. Liu, Hybrid in silico models for drug-induced liver injury using chemical descriptors and in vitro cell-imaging information, Journal of applied toxicology, vol.34, pp.281-288, 2014.

J. Ziang, KNN approach to unbalanced data distributions: a case study involving information extraction, Proceedings of the International Conference on Machine Learning, vol.126, 2003.