A. Abdellaoui, Population structure, migration, and diversifying selection in the Netherlands, Eur. J. Hum. Genet, vol.21, pp.1277-1285, 2013.

G. Abraham and M. Inouye, Fast principal component analysis of large-scale genome-wide data, PLoS ONE, vol.9, p.93766, 2014.

G. Abraham, SparSNP: fast and memory-efficient analysis of all SNPs for phenotype prediction, BMC Bioinformatics, vol.13, p.88, 2012.

G. Abraham, FlashPCA2: principal component analysis of biobank-scale genotype datasets, bioRxiv, 2016.

Y. S. Aulchenko, GenABEL: an R library for genome-wide association analysis, Bioinformatics, vol.23, pp.1294-1296, 2007.

S. R. Browning and B. L. Browning, Rapid and accurate haplotype phasing and missing data inference for whole genome association studies by use of localized haplotype clustering, Am. J. Hum. Genet, vol.81, pp.1084-1097, 2007.

C. C. Chang, Second-generation PLINK: rising to the challenge of larger and richer datasets, Gigascience, vol.4, p.7, 2015.

N. Chatterjee, Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies, Nature Genet, vol.45, pp.405-406, 2013.

T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, pp.785-794, 2016.

P. C. Dubois, Multiple common variants for celiac disease influencing immune gene expression, Nature Genet, vol.42, pp.295-302, 2010.

F. Dudbridge, Power and predictive accuracy of polygenic risk scores, PLoS Genet, vol.9, p.1003348, 2013.

D. Eddelbuettel and R. François, Rcpp: seamless R and C++ integration, J. Stat. Softw, vol.40, pp.1-18, 2011.

J. Euesden, PRSice: Polygenic Risk Score software, Bioinformatics, vol.31, pp.1466-1468, 2015.

J. Friedman, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw, vol.33, pp.1-22, 2010.

K. J. Galinsky, Fast principal-component analysis reveals convergent evolution of ADH1B in Europe and East Asia, Am. J. Hum. Genet, vol.98, pp.456-472, 2016.

S. M. Gogarten, GWASTools: an R/Bioconductor package for quality control and analysis of genome-wide association studies, Bioinformatics, vol.28, pp.3329-3331, 2012.

J. J. Hayward, Complex disease and phenotype mapping in the domestic dog, Nat. Commun, vol.7, p.10460, 2016.

M. J. Kane, Scalable strategies for computing with massive data, J. Stat. Softw, vol.55, pp.1-19, 2013.

R. B. Lehoucq and D. C. Sorensen, Deflation techniques for an implicitly restarted Arnoldi iteration, SIAM J. Matrix Anal. Appl, vol.17, pp.789-821, 1996.

K. Luu, pcadapt: an R package to perform genome scans for selection based on principal component analysis, Mol. Ecol. Resour, vol.17, pp.67-77, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01430346

S. McCarthy, A reference panel of 64,976 haplotypes for genotype imputation, Nature Genet, vol.48, p.1279, 2016.

M. R. Nelson, The population reference sample, POPRES: a resource for population, disease, and pharmacological genetics research, Am. J. Hum. Genet, vol.83, pp.347-358, 2008.

J. Nielsen and T. Mailund, SNPFile-a software library and file format for large scale association mapping and population genetics studies, BMC Bioinformatics, vol.9, p.526, 2008.

C. Palmer and I. Pe'er, Bias characterization in probabilistic genotype data and improved signal detection with multiple imputation, PLoS Genet, vol.12, p.1006091, 2016.

A. L. Price, Principal components analysis corrects for stratification in genome-wide association studies, Nature Genet, vol.38, pp.904-909, 2006.

A. L. Price, Long-range LD can confound genome scans in admixed populations, Am. J. Hum. Genet, vol.83, pp.132-135, 2008.

S. Purcell, PLINK: a tool set for whole-genome association and population-based linkage analyses, Am. J. Hum. Genet, vol.81, pp.559-575, 2007.

Y. Qiu and J. Mei, RSpectra: Solvers for Large Scale Eigenvalue and SVD Problems, 2016.

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2017.

K. Sikorska, GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies, BMC Bioinformatics, vol.14, p.166, 2013.

R. Tibshirani, Strong rules for discarding predictors in lasso-type problems, J. R. Stat. Soc. Series B Stat. Methodol, vol.74, pp.245-266, 2012.

Y. Wang, Fast accurate missing SNP genotype local imputation, BMC Res. Notes, vol.5, p.404, 2012.

Y. Zeng and P. Breheny, The biglasso package: a memory-and computation-efficient solver for lasso model fitting with big data in R, 2017.

G. Zheng, Analysis of Genetic Association Studies. Statistics for Biology and Health, 2012.

X. Zheng, A high-performance computing toolset for relatedness and principal component analysis of SNP data, Bioinformatics, vol.28, pp.3326-3328, 2012.

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc. Series B Stat. Methodol, vol.67, pp.301-320, 2005.

T. Chen and C. Guestrin, XGBoost: Reliable Large-scale Tree Boosting System, arXiv, pp.1-6, 2016.

G. Abraham, A. Kowalczyk, J. Zobel, and M. Inouye, Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease, Genet. Epidemiol, vol.37, pp.184-195, 2013.

G. Abraham, J. A. Tye-din, O. G. Bhalala, A. Kowalczyk, and J. Zobel, Accurate and robust genomic prediction of celiac disease using statistical learning, PLoS Genet, vol.10, p.1004137, 2014.

V. Botta, G. Louppe, P. Geurts, and L. Wehenkel, Exploiting SNP correlations within random forest for genome-wide association studies, PLoS One, vol.9, 2014.

L. Breiman, Random forests, Mach. Learn, vol.45, pp.5-32, 2001.

C. Bycroft, C. Freeman, D. Petkova, G. Band, and L. T. Elliott, The UK biobank resource with deep phenotyping and genomic data, Nature, vol.562, pp.203-209, 2018.

N. Chatterjee, B. Wheeler, J. Sampson, P. Hartge, and S. J. Chanock, Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies, Nat. Genet, vol.45, pp.400-405, 2013.

N. Chatterjee, J. Shi, and M. García-closas, Developing and evaluating polygenic risk prediction models for stratified disease prevention, Nat. Rev. Genet, vol.17, pp.392-406, 2016.

S. Dey, R. Gupta, M. Steinbach, and V. Kumar, Integration of clinical and genomic data: a methodological survey, 2013.

L. E. Dodd and M. S. Pepe, Partial AUC estimation and regression, Biometrics, vol.59, pp.614-623, 2003.

P. C. Dubois, G. Trynka, L. Franke, K. A. Hunt, and J. Romanos, Multiple common variants for celiac disease influencing immune gene expression, Nat. Genet, vol.42, p.465, 2010.

F. Dudbridge, Power and predictive accuracy of polygenic risk scores, PLoS Genet, vol.9, p.1003348, 2013.

D. M. Evans, P. M. Visscher, and N. R. Wray, Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk, Hum. Mol. Genet, vol.18, pp.3525-3531, 2009.

D. S. Falconer, The inheritance of liability to certain diseases, estimated from the incidence among relatives, Ann. Hum. Genet, vol.29, pp.51-76, 1965.

J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw, vol.33, issue.1, 2010.

T. Hastie, R. Tibshirani, and J. Friedman, Model assessment and selection, The Elements of Statistical Learning, pp.219-259, 2008.

A. E. Hoerl and R. W. Kennard, Ridge regression: biased estimation for nonorthogonal problems, Technometrics, vol.12, pp.55-67, 1970.

A. C. Janssens, R. Moonesinghe, Q. Yang, E. W. Steyerberg, and C. M. Van-duijn, The impact of genotype frequencies on the clinical validity of genomic profiling for predicting common chronic diseases, Genet. Med, vol.9, pp.528-535, 2007.

L. Lello, S. G. Avery, L. Tellier, A. I. Vazquez, and G. De-los-campos, Accurate genomic prediction of human height, Genetics, vol.210, pp.477-497, 2018.

L. B. Lusted, Signal detectability and medical decision-making, Science, vol.171, pp.1217-1219, 1971.

C. Márquez-luna, P. Loh, and A. L. Price, Multiethnic polygenic risk scores improve risk prediction in diverse populations, Genet. Epidemiol, vol.41, pp.811-823, 2017.

A. R. Martin, C. R. Gignoux, R. K. Walters, G. L. Wojcik, and B. M. Neale, Human demographic history impacts genetic risk prediction across diverse populations, Am. J. Hum. Genet, vol.100, pp.635-649, 2017.

N. Mavaddat, K. Michailidou, J. Dennis, M. Lush et al., Polygenic risk scores for prediction of breast cancer and breast cancer subtypes, Am. J. Hum. Genet, vol.104, pp.21-34, 2019.

D. K. McClish, Analyzing a portion of the ROC curve, Med. Decis. Making, vol.9, pp.190-195, 1989.

S. Okser, T. Pahikkala, A. Airola, T. Salakoski, and S. Ripatti, Regularized machine learning in the genetic prediction of complex traits, PLoS Genet, vol.10, 2014.

N. Pashayan, S. W. Duffy, D. E. Neal, F. C. Hamdy, and J. L. Donovan, Implications of polygenic risk-stratified screening for prostate cancer on overdiagnosis, Genet. Med, vol.17, pp.789-795, 2015.

F. Privé, H. Aschard, A. Ziyatdinov, and M. G. Blum, Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr, Bioinformatics, vol.34, pp.2781-2787, 2018.

S. M. Purcell, N. R. Wray, J. L. Stone, P. M. Visscher, and M. C. O'donovan, Common polygenic variation contributes to risk of schizophrenia and bipolar disorder, Nature, vol.460, pp.748-752, 2009.

R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. B, vol.58, pp.267-288, 1996.

R. Tibshirani, J. Bien, J. Friedman, T. Hastie, and N. Simon, Strong rules for discarding predictors in lasso-type problems, J. R. Stat. Soc. Series B Stat. Methodol, vol.74, pp.245-266, 2012.

B. J. Vilhjálmsson, J. Yang, H. K. Finucane, A. Gusev, and S. Lindström, Modeling linkage disequilibrium increases accuracy of polygenic risk scores, Am. J. Hum. Genet, vol.97, pp.576-592, 2015.

E. B. Ware, L. L. Schmitz, J. D. Faul, A. Gard, and C. Mitchell, Heterogeneity in polygenic scores for common human traits, bioRxiv, p.106062, 2017.

Z. Wei, K. Wang, H. Qu, H. Zhang, and J. Bradfield, From disease association to risk assessment: an optimistic view from genome-wide association studies on type 1 diabetes, PLoS Genet, vol.5, 2009.

Z. Wei, W. Wang, J. Bradfield, J. Li, and C. Cardinale, Large sample size, wide variant spectrum, and advanced machine-learning technique boost risk prediction for inflammatory bowel disease, Am. J. Hum. Genet, vol.92, pp.1008-1012, 2013.

M. C. Sachs, plotROC: a tool for plotting ROC curves, Journal of Statistical Software, vol.79, 2017.

N. R. Wray, J. Yang, M. E. Goddard, and P. M. Visscher, The genetic interpretation of area under the ROC curve in genomic profiling, PLoS genetics, vol.6, issue.2, p.1000864, 2010.

L. Breiman, Stacked regressions, Machine learning, vol.24, pp.49-64, 1996.

A. Buniello, J. A. Macarthur, M. Cerezo, L. W. Harris, J. Hayhurst et al., The NHGRI-EBI GWAS Catalog of published genomewide association studies, targeted arrays and summary statistics 2019, Nucleic acids research, vol.47, issue.D1, pp.1005-1012, 2018.

C. Bycroft, C. Freeman, D. Petkova, G. Band, L. T. Elliott et al., The UK biobank resource with deep phenotyping and genomic data, Nature, vol.562, issue.7726, p.203, 2018.

J. Censin, C. Nowak, N. Cooper, P. Bergsten, J. A. Todd et al., Childhood adiposity and risk of type 1 diabetes: A Mendelian randomization study, PLoS medicine, vol.14, issue.8, p.1002362, 2017.

C. C. Chang, C. C. Chow, L. C. Tellier, S. Vattikuti, S. M. Purcell et al., Second-generation PLINK: rising to the challenge of larger and richer datasets, Gigascience, vol.4, issue.1, p.7, 2015.

N. Chatterjee, J. Shi, and M. García-closas, Developing and evaluating polygenic risk prediction models for stratified disease prevention, Nature Reviews Genetics, vol.17, issue.7, p.392, 2016.

S. Chun, M. Imakaev, D. Hui, N. A. Patsopoulos, B. M. Neale et al., Non-parametric polygenic risk prediction using partitioned GWAS summary statistics, bioRxiv, p.370064, 2019.

W. Chung, J. Chen, C. Turman, S. Lindstrom, Z. Zhu et al., Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes, Nature communications, vol.10, issue.1, p.569, 2019.

D. R. Cox, Regression models and life-tables, Journal of the Royal Statistical Society: Series B (Methodological), vol.34, issue.2, pp.187-202, 1972.

F. Demenais, P. Margaritte-jeannin, K. C. Barnes, W. O. Cookson, J. Altmüller et al., Multiancestry association study identifies new asthma risk loci that colocalize with immune-cell enhancer marks, Nature genetics, vol.50, issue.1, p.42, 2018.

F. Dudbridge, Power and predictive accuracy of polygenic risk scores, PLoS genetics, vol.9, issue.3, p.1003348, 2013.

J. Euesden, C. M. Lewis, and P. F. O'Reilly, PRSice: polygenic risk score software, Bioinformatics, vol.31, issue.9, pp.1466-1468, 2014.

D. S. Falconer, The inheritance of liability to certain diseases, estimated from the incidence among relatives, Annals of human genetics, vol.29, issue.1, pp.51-76, 1965.

T. Ge, C. Chen, Y. Ni, Y. A. Feng, and J. W. Smoller, Polygenic prediction via Bayesian regression and continuous shrinkage priors, bioRxiv, p.416859, 2019.

J. J. Hughey, S. D. Rhoades, D. Y. Fu, L. Bastarache, J. C. Denny et al., Cox regression increases power to detect genotype-phenotype associations in genomic studies using the electronic health record, BioRxiv, p.599910, 2019.

M. Inouye, G. Abraham, C. P. Nelson, A. M. Wood, M. J. Sweeting et al., Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention, Journal of the American College of Cardiology, vol.72, issue.16, pp.1883-1893, 2018.

E. Krapohl, H. Patel, S. Newhouse, C. J. Curtis, S. Von-stumm et al., Multi-polygenic score approach to trait prediction, Molecular psychiatry, vol.23, issue.5, p.1368, 2018.

J. J. Lee, R. Wedow, A. Okbay, E. Kong, O. Maghzian et al., Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals, Nature genetics, vol.50, issue.8, p.1112, 2018.

S. H. Lee, M. E. Goddard, N. R. Wray, and P. M. Visscher, A better coefficient of determination for genetic profile analysis, Genetic epidemiology, vol.36, issue.3, pp.214-224, 2012.

L. R. Lloyd-Jones, J. Zeng, J. Sidorenko, L. Yengo, G. Moser et al., Improved polygenic prediction by Bayesian multiple regression on summary statistics, bioRxiv, p.522961, 2019.

R. M. Maier, Z. Zhu, S. H. Lee, M. Trzaskowski, D. M. Ruderfer et al., Improving genetic prediction by leveraging genetic correlations among human diseases and traits, Nature communications, vol.9, issue.1, p.989, 2018.

T. S. Mak, R. M. Porsch, S. W. Choi, X. Zhou, and P. C. Sham, Polygenic scores via penalized regression on summary statistics, Genetic epidemiology, vol.41, issue.6, pp.469-480, 2017.

K. Michailidou, S. Lindström, J. Dennis, J. Beesley, S. Hui et al., Association analysis identifies 65 new breast cancer risk loci, Nature, vol.551, issue.7678, p.92, 2017.

M. Nikpay, A. Goel, H. Won, L. M. Hall, C. Willenborg et al., A comprehensive 1000 genomes-based genome-wide association meta-analysis of coronary artery disease, Nature genetics, vol.47, issue.10, p.1121, 2015.

Y. Okada, D. Wu, G. Trynka, T. Raj, C. Terao et al., Genetics of rheumatoid arthritis contributes to biology and drug discovery, Nature, vol.506, issue.7488, p.376, 2014.

J. K. Pritchard and M. Przeworski, Linkage disequilibrium in humans: models and data, The American Journal of Human Genetics, vol.69, issue.1, pp.1-14, 2001.

F. Privé, H. Aschard, A. Ziyatdinov, and M. G. Blum, Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr, Bioinformatics, vol.34, issue.16, pp.2781-2787, 2018.

F. Privé, H. Aschard, and M. G. Blum, Efficient implementation of penalized regression for genetic risk prediction, Genetics, p.302019, 2019.

S. Purcell, B. Neale, K. Todd-brown, L. Thomas, M. A. Ferreira et al., PLINK: a tool set for whole-genome association and population-based linkage analyses, The American journal of human genetics, vol.81, issue.3, pp.559-575, 2007.

S. M. Purcell, N. R. Wray, J. L. Stone, P. M. Visscher, M. C. O'donovan et al., Common polygenic variation contributes to risk of schizophrenia and bipolar disorder, Nature, vol.460, issue.7256, pp.748-752, 2009.

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2018.

F. R. Schumacher, A. A. Olama, S. I. Berndt, S. Benlloch, M. Ahmed et al., Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci, Nature genetics, vol.50, issue.7, p.928, 2018.

R. A. Scott, L. J. Scott, R. Mägi, L. Marullo, K. J. Gaulton et al., An expanded genome-wide association study of type 2 diabetes in Europeans, Diabetes, vol.66, issue.11, pp.2888-2902, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01832132

R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon et al., Strong rules for discarding predictors in lasso-type problems, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.74, issue.2, pp.245-266, 2012.

B. J. Vilhjálmsson, J. Yang, H. K. Finucane, A. Gusev, S. Lindström et al., Modeling linkage disequilibrium increases accuracy of polygenic risk scores, The American Journal of Human Genetics, vol.97, issue.4, pp.576-592, 2015.

N. R. Wray, M. E. Goddard, and P. M. Visscher, Prediction of individual genetic risk to disease from genome-wide association studies, Genome research, vol.17, issue.10, pp.1520-1528, 2007.

N. R. Wray, J. Yang, M. E. Goddard, and P. M. Visscher, The genetic interpretation of area under the ROC curve in genomic profiling, PLoS genetics, vol.6, issue.2, p.1000864, 2010.

N. R. Wray, J. Yang, B. J. Hayes, A. L. Price, M. E. Goddard et al., Pitfalls of predicting complex traits from SNPs, Nature Reviews Genetics, vol.14, issue.7, p.507, 2013.

N. R. Wray, S. H. Lee, D. Mehta, A. A. Vinkhuyzen, F. Dudbridge et al., Research review: polygenic methods and their application to psychiatric traits, Journal of Child Psychology and Psychiatry, 2014.

G. Abraham and M. Inouye, Genomic risk prediction of complex human disease and its clinical application, Current opinion in genetics & development, vol.33, pp.10-16, 2015.

G. Abraham, A. Kowalczyk, J. Zobel, and M. Inouye, SparSNP: fast and memory-efficient analysis of all SNPs for phenotype prediction, BMC bioinformatics, vol.13, issue.1, p.88, 2012.

G. Abraham, A. Kowalczyk, J. Zobel, and M. Inouye, Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease, Genetic Epidemiology, vol.37, issue.2, pp.184-195, 2013.

G. Abraham, J. A. Tye-din, O. G. Bhalala, A. Kowalczyk, J. Zobel et al., Accurate and robust genomic prediction of celiac disease using statistical learning, PLoS genetics, vol.10, issue.2, p.1004137, 2014.

Anglian Breast Cancer Study Group, Prevalence and penetrance of BRCA1 and BRCA2 mutations in a population-based series of breast cancer cases, British Journal of Cancer, vol.83, issue.10, p.1301, 2000.

H. Aschard, J. Chen, M. C. Cornelis, L. B. Chibnik, E. W. Karlson et al., Inclusion of gene-gene and gene-environment interactions unlikely to dramatically improve risk prediction for complex diseases, The American Journal of Human Genetics, vol.90, issue.6, pp.962-972, 2012.

W. Astle and D. J. Balding, Population structure and cryptic relatedness in genetic association studies, Statistical Science, vol.24, issue.4, pp.451-471, 2009.

P. L. Auer and G. Lettre, Rare variant association studies: considerations, challenges and opportunities, Genome medicine, vol.7, issue.1, p.16, 2015.

F. Bäckhed, H. Ding, T. Wang, L. V. Hooper, G. Y. Koh et al., The gut microbiota as an environmental factor that regulates fat storage, Proceedings of the National Academy of Sciences, vol.101, issue.44, pp.15718-15723, 2004.

V. Botta, G. Louppe, P. Geurts, and L. Wehenkel, Exploiting snp correlations within random forest for genome-wide association studies, PloS one, vol.9, issue.4, p.93379, 2014.

L. Breiman, Stacked regressions, Machine learning, vol.24, pp.49-64, 1996.

B. K. Bulik-Sullivan, P. Loh, H. K. Finucane, S. Ripke, J. Yang et al., LD score regression distinguishes confounding from polygenicity in genome-wide association studies, Nature genetics, vol.47, issue.3, p.291, 2015.

C. Bycroft, C. Freeman, D. Petkova, G. Band, L. T. Elliott et al., The UK biobank resource with deep phenotyping and genomic data, Nature, vol.562, issue.7726, p.203, 2018.

C. S. Carlson, T. C. Matise, K. E. North, C. A. Haiman, M. D. Fesinmeyer et al., Generalization and dilution of association results from European GWAS in populations of non-European ancestry: the PAGE study, PLoS biology, vol.11, issue.9, p.1001661, 2013.

C. C. Chang, C. C. Chow, L. C. Tellier, S. Vattikuti, S. M. Purcell et al., Second-generation PLINK: rising to the challenge of larger and richer datasets, Gigascience, vol.4, issue.1, p.7, 2015.

D. Chasioti, J. Yan, K. Nho, and A. J. Saykin, Progress in polygenic composite scores in Alzheimer's and other complex diseases, Trends in Genetics, 2019.

T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pp.785-794, 2016.

S. W. Choi, T. S. Mak, and P. F. O'Reilly, A guide to performing polygenic risk score analyses, bioRxiv, p.416545, 2018.

S. Chun, M. Imakaev, D. Hui, N. A. Patsopoulos, B. M. Neale et al., Non-parametric polygenic risk prediction using partitioned GWAS summary statistics, bioRxiv, p.370064, 2019.

M. A. Coram, H. Fang, S. I. Candille, T. L. Assimes, and H. Tang, Leveraging multi-ethnic evidence for risk assessment of quantitative traits in minority populations, The American Journal of Human Genetics, vol.101, issue.2, pp.218-226, 2017.

C. E. DeSantis, S. A. Fedewa, A. Sauer, J. L. Kramer, R. A. Smith et al., Breast cancer statistics, 2015: convergence of incidence rates between black and white women, CA: a cancer journal for clinicians, vol.66, pp.31-42, 2015.

S. Dey, R. Gupta, M. Steinbach, and V. Kumar, Integration of clinical and genomic data: a methodological survey, Briefings in Bioinformatics, 2013.

F. Dudbridge, Power and predictive accuracy of polygenic risk scores, PLoS Genetics, vol.9, issue.3, p.1003348, 2013.

Asking for more, Nature Genetics, vol.44, issue.7, p.733, 2012.

J. Euesden, C. M. Lewis, and P. F. O'Reilly, PRSice: Polygenic Risk Score software, Bioinformatics, vol.31, pp.1466-1468, 2015.

C. Fuchsberger, J. Flannick, T. M. Teslovich, A. Mahajan, V. Agarwala et al., The genetic architecture of type 2 diabetes, Nature, vol.536, issue.7614, p.41, 2016.

T. Ge, C. Chen, Y. Ni, Y. A. Feng, and J. W. Smoller, Polygenic prediction via Bayesian regression and continuous shrinkage priors, bioRxiv, p.416859, 2019.

G. Gibson, The environmental contribution to gene expression profiles, Nature reviews genetics, vol.9, issue.8, p.575, 2008.

D. Golan and S. Rosset, Effective genetic-risk prediction using mixed models, The American Journal of Human Genetics, vol.95, issue.4, pp.383-393, 2014.

B. Goudey, G. Abraham, E. Kikianty, Q. Wang, D. Rawlinson et al., Interactions within the MHC contribute to the genetic architecture of celiac disease, PloS one, vol.12, issue.3, p.172826, 2017.

B. M. Grande, A. Baghela, A. Cavalla, F. Privé, P. Zhang et al., Hackathon-driven tutorial development, 1000.
URL : https://hal.archives-ouvertes.fr/hal-02272044

A. Hamosh, A. F. Scott, J. S. Amberger, C. A. Bocchini, and V. A. McKusick, Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders, Nucleic acids research, vol.33, issue.suppl_1, pp.514-517, 2005.

W. G. Hill, M. E. Goddard, and P. M. Visscher, Data and theory point to mainly additive genetic variance for complex traits, PLoS genetics, vol.4, issue.2, p.1000008, 2008.

S. Horvath, DNA methylation age of human tissues and cell types, Genome biology, vol.14, issue.10, p.3156, 2013.

S. Horvath and K. Raj, DNA methylation-based biomarkers and the epigenetic clock theory of ageing, Nature Reviews Genetics, p.1, 2018.

R. R. Hudson, Two-locus sampling distributions and their application, Genetics, vol.159, issue.4, pp.1805-1817, 2001.

J. R. Huyghe, S. A. Bien, T. A. Harrison, H. M. Kang, S. Chen et al., Discovery of common and rare genetic risk variants for colorectal cancer, Nature genetics, vol.51, issue.1, p.76, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02153497

M. Inouye, G. Abraham, C. P. Nelson, A. M. Wood, M. J. Sweeting et al., Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention, Journal of the American College of Cardiology, vol.72, issue.16, pp.1883-1893, 2018.

J. Jakobsdottir, M. B. Gorin, Y. P. Conley, R. E. Ferrell, and D. E. Weeks, Interpretation of genetic association studies: markers with replicated highly significant odds ratios may be poor classifiers, PLoS genetics, vol.5, issue.2, p.1000337, 2009.

A. C. Janssens and M. J. Joyner, Polygenic risk scores that predict common diseases using millions of single nucleotide polymorphisms: Is more, better? Clinical chemistry, p.2018, 2019.

A. C. Janssens, Y. S. Aulchenko, S. Elefante, G. J. Borsboom, E. W. Steyerberg et al., Predictive testing for complex diseases using multiple genes: fact or fiction?, Genetics in medicine, vol.8, issue.7, p.395, 2006.

K. B. Kuchenbaecker, J. L. Hopper, D. R. Barnes, K. Phillips, T. M. Mooij et al., Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers, JAMA, vol.317, issue.23, pp.2402-2416, 2017.

L. Lello, S. G. Avery, L. Tellier, A. I. Vazquez, G. De-los-campos et al., Accurate genomic prediction of human height, Genetics, vol.210, issue.2, pp.477-497, 2018.

T. L. Lenz, A. J. Deutsch, B. Han, X. Hu, Y. Okada et al., Widespread non-additive and interaction effects within HLA loci modulate the risk of autoimmune diseases, Nature genetics, vol.47, issue.9, p.1085, 2015.

D. Lin and D. Zeng, Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data, Genetic Epidemiology, vol.34, pp.60-66, 2010.

P. Loh, G. Kichaev, S. Gazal, A. P. Schoech, and A. L. Price, Mixed-model association for biobank-scale datasets, Nature genetics, vol.50, issue.7, p.906, 2018.

K. Luu, E. Bazin, and M. G. Blum, pcadapt: an R package to perform genome scans for selection based on principal component analysis, Molecular ecology resources, vol.17, issue.1, pp.67-77, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01430346

R. Mägi, M. Horikoshi, T. Sofer, A. Mahajan, H. Kitajima et al., Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution, Human Molecular Genetics, vol.26, issue.18, pp.3639-3650, 2017.

R. Maier, G. Moser, G. Chen, S. Ripke, D. Absher et al., Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder, The American Journal of Human Genetics, vol.96, issue.2, pp.283-294, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01541309

T. S. Mak, R. M. Porsch, S. W. Choi, X. Zhou, and P. C. Sham, Polygenic scores via penalized regression on summary statistics, Genetic epidemiology, vol.41, issue.6, pp.469-480, 2017.

T. A. Manolio, F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff et al., Finding the missing heritability of complex diseases, Nature, vol.461, issue.7265, p.747, 2009.

J. Marchini and B. Howie, Genotype imputation for genome-wide association studies, Nature Reviews Genetics, vol.11, issue.7, p.499, 2010.

E. Marouli, M. Graff, C. Medina-gomez, K. S. Lo, A. R. Wood et al., Rare and low-frequency coding variants alter human adult height, Nature, vol.542, issue.7640, p.186, 2017.

C. Márquez-Luna, P. Loh, South Asian Type 2 Diabetes (SAT2D) Consortium, SIGMA Type 2 Diabetes Consortium, and A. L. Price, Multiethnic polygenic risk scores improve risk prediction in diverse populations, Genetic epidemiology, vol.41, issue.8, pp.811-823, 2017.

A. R. Martin, C. R. Gignoux, R. K. Walters, G. L. Wojcik, B. M. Neale et al., Human demographic history impacts genetic risk prediction across diverse populations, The American Journal of Human Genetics, vol.100, issue.4, pp.635-649, 2017.

A. R. Martin, M. Kanai, Y. Kamatani, Y. Okada, B. M. Neale et al., Clinical use of current polygenic risk scores may exacerbate health disparities, Nature genetics, vol.51, issue.4, p.584, 2019.

N. Mavaddat, P. D. Pharoah, K. Michailidou, J. Tyrer, M. N. Brook et al., Prediction of breast cancer risk based on profiling with common genetic variants, JNCI: Journal of the National Cancer Institute, vol.107, issue.5, 2015.

S. Mccarthy, S. Das, W. Kretzschmar, O. Delaneau, A. R. Wood et al., A reference panel of 64,976 haplotypes for genotype imputation, Nature genetics, vol.48, issue.10, p.1279, 2016.

R. L. Minster, N. L. Hawley, C. Su, G. Sun, E. E. Kershaw et al., A thrifty variant in CREBRF strongly influences body mass index in Samoans, Nature genetics, vol.48, issue.9, p.1049, 2016.

I. Moltke, N. Grarup, M. E. Jørgensen, P. Bjerregaard, J. T. Treebak et al., A common Greenlandic TBC1D4 variant confers muscle insulin resistance and type 2 diabetes, Nature, vol.512, issue.7513, p.190, 2014.

A. C. Need and D. B. Goldstein, Next generation disparities in human genomics: concerns and remedies, Trends in Genetics, vol.25, issue.11, pp.489-494, 2009.

M. R. Nelson, K. Bryc, K. S. King, A. Indap, A. R. Boyko et al., The population reference sample, POPRES: a resource for population, disease, and pharmacological genetics research, The American Journal of Human Genetics, vol.83, issue.3, pp.347-358, 2008.

C. Niel, C. Sinoquet, C. Dina, and G. Rocheleau, A survey about methods dedicated to epistasis detection, Frontiers in genetics, vol.6, p.285, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01205577

M. Nikpay, A. Goel, H. Won, L. M. Hall, C. Willenborg et al., A comprehensive 1000 genomes-based genome-wide association meta-analysis of coronary artery disease, Nature genetics, vol.47, issue.10, p.1121, 2015.

S. Okser, T. Pahikkala, A. Airola, T. Salakoski, S. Ripatti et al., Regularized machine learning in the genetic prediction of complex traits, PLoS genetics, vol.10, issue.11, p.1004754, 2014.

B. Pasaniuc and A. L. Price, Dissecting the genetics of complex traits using summary association statistics, Nature Reviews Genetics, vol.18, issue.2, p.117, 2017.

B. Pasaniuc, N. Rohland, P. J. Mclaren, K. Garimella, N. Zaitlen et al., Extremely low-coverage sequencing and imputation increases power for genome-wide association studies, Nature genetics, vol.44, issue.6, p.631, 2012.

B. Pasaniuc, N. Zaitlen, H. Shi, G. Bhatia, A. Gusev et al., Fast and accurate imputation of summary statistics enhances evidence of functional enrichment, Bioinformatics, vol.30, issue.20, pp.2906-2914, 2014.

N. Pashayan, S. W. Duffy, D. E. Neal, F. C. Hamdy, J. L. Donovan et al., Implications of polygenic risk-stratified screening for prostate cancer on overdiagnosis, Genetics in Medicine, vol.17, issue.10, p.789, 2015.

I. Pe'er, R. Yelensky, D. Altshuler, and M. J. Daly, Estimation of the multiple testing burden for genomewide association studies of nearly all common variants, Genetic epidemiology, vol.32, issue.4, pp.381-385, 2008.

M. S. Pepe, H. Janes, G. Longton, W. Leisenring, and P. Newcomb, Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker, American journal of epidemiology, vol.159, issue.9, pp.882-890, 2004.

A. B. Popejoy and S. M. Fullerton, Genomics is failing on diversity, Nature News, vol.538, issue.7624, p.161, 2016.

A. L. Price, N. J. Patterson, R. M. Plenge, M. E. Weinblatt, N. A. Shadick et al., Principal components analysis corrects for stratification in genome-wide association studies, Nature genetics, vol.38, issue.8, p.904, 2006.

J. K. Pritchard and N. J. Cox, The allelic architecture of human disease genes: common disease-common variant... or not?, Human molecular genetics, vol.11, pp.2417-2423, 2002.

F. Privé, H. Aschard, A. Ziyatdinov, and M. G. Blum, Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr, Bioinformatics, vol.34, issue.16, pp.2781-2787, 2018.

F. Privé, H. Aschard, and M. G. Blum, Efficient implementation of penalized regression for genetic risk prediction, Genetics, p.302019, 2019.

S. L. Pulit, B. F. Voight, and P. I. de Bakker, Multiethnic genetic association studies improve power for locus discovery, PloS one, vol.5, issue.9, p.12600, 2010.

S. Purcell, B. Neale, K. Todd-Brown, L. Thomas, M. A. Ferreira et al., PLINK: a tool set for whole-genome association and population-based linkage analyses, The American journal of human genetics, vol.81, issue.3, pp.559-575, 2007.

S. M. Purcell, N. R. Wray, J. L. Stone, P. M. Visscher, M. C. O'Donovan et al., Common polygenic variation contributes to risk of schizophrenia and bipolar disorder, Nature, vol.460, issue.7256, pp.748-752, 2009.

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2018.

S. Reisberg, T. Iljasenko, K. Läll, K. Fischer, and J. Vilo, Comparing distributions of polygenic risk scores of type 2 diabetes and coronary heart disease within different populations, PloS one, vol.12, issue.7, p.179238, 2017.

D. M. Roden and J. C. Denny, Integrating electronic health record genotype and phenotype datasets to transform patient care, Clinical Pharmacology & Therapeutics, vol.99, issue.3, pp.298-305, 2016.

K. Salari, H. Watkins, and E. A. Ashley, Personalized medicine: hope or hype?, European heart journal, vol.33, issue.13, pp.1564-1570, 2012.

R. A. Scott, L. J. Scott, R. Mägi, L. Marullo, K. J. Gaulton et al., An expanded genome-wide association study of type 2 diabetes in Europeans, Diabetes, vol.66, issue.11, pp.2888-2902, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01832132

K. Sikorska, E. Lesaffre, P. F. Groenen, and P. H. Eilers, GWAS on your notebook: fast semiparallel linear and logistic regression for genome-wide association studies, BMC bioinformatics, vol.14, issue.1, p.166, 2013.

K. Silventoinen, Determinants of variation in adult body height, Journal of biosocial science, vol.35, issue.2, pp.263-285, 2003.

M. Sohail, R. M. Maier, A. Ganna, A. Bloemendal, A. R. Martin et al., Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies, eLife, vol.8, p.39702, 2019.

D. Speed and D. J. Balding, MultiBLUP: improved SNP-based prediction for complex traits, Genome research, vol.24, issue.9, pp.1550-1557, 2014.

D. Speed and D. J. Balding, SumHer better estimates the SNP heritability of complex traits from summary statistics, Nature genetics, p.1, 2018.

D. Taliun, D. N. Harris, M. D. Kessler, J. Carlson, Z. A. Szpiech et al., Sequencing of 53,831 diverse genomes from the NHLBI TOPMed program, BioRxiv, p.563866, 2019.

V. Tam, N. Patel, M. Turcotte, Y. Bossé, G. Paré et al., Benefits and limitations of genome-wide association studies, Nature Reviews Genetics, p.1, 2019.

R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon et al., Strong rules for discarding predictors in lasso-type problems, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.74, issue.2, pp.245-266, 2012.

E. J. Topol, Individualized medicine from prewomb to tomb, Cell, vol.157, issue.1, pp.241-253, 2014.

A. Torkamani, N. E. Wineinger, and E. J. Topol, The personal and clinical utility of polygenic risk scores, Nature Reviews Genetics, p.1, 2018.

B. J. Vilhjálmsson, J. Yang, H. K. Finucane, A. Gusev, S. Lindström et al., Modeling linkage disequilibrium increases accuracy of polygenic risk scores, The American Journal of Human Genetics, vol.97, issue.4, pp.576-592, 2015.

P. M. Visscher, S. E. Medland, M. A. Ferreira, K. I. Morley, G. Zhu et al., Assumption-free estimation of heritability from genome-wide identity-by-descent sharing between full siblings, PLoS genetics, vol.2, issue.3, p.41, 2006.

P. M. Visscher, W. G. Hill, and N. R. Wray, Heritability in the genomics era - concepts and misconceptions, Nature reviews genetics, vol.9, issue.4, p.255, 2008.

P. M. Visscher, N. R. Wray, Q. Zhang, P. Sklar, M. I. McCarthy et al., 10 years of GWAS discovery: biology, function, and translation, The American Journal of Human Genetics, vol.101, issue.1, pp.5-22, 2017.

P. Wainschtein, D. P. Jain, L. Yengo, Z. Zheng, L. A. Cupples et al., Recovery of trait heritability from whole genome sequence data, bioRxiv, p.588020, 2019.

N. J. Wald and R. Old, The illusion of polygenic disease risk prediction, Genetics in Medicine, p.1, 2019.

Z. Wei, K. Wang, H. Qu, H. Zhang, J. Bradfield et al., From disease association to risk assessment: an optimistic view from genome-wide association studies on type 1 diabetes, PLoS genetics, vol.5, issue.10, p.1000678, 2009.

Wellcome Trust Case Control Consortium, Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls, Nature, vol.447, issue.7145, p.661, 2007.

D. Welter, J. Macarthur, J. Morales, T. Burdett, P. Hall et al., The NHGRI GWAS Catalog, a curated resource of SNP-trait associations, Nucleic acids research, vol.42, issue.D1, pp.1001-1006, 2013.

G. Wojcik, M. Graff, K. K. Nishimura, R. Tao, J. Haessler et al., The PAGE study: how genetic diversity improves our understanding of the architecture of complex traits, bioRxiv, p.188094, 2018.

A. R. Wood, T. Esko, J. Yang, S. Vedantam, T. H. Pers et al., Defining the role of common variation in the genomic and biological architecture of adult human height, Nature genetics, vol.46, issue.11, p.1173, 2014.

N. R. Wray, M. E. Goddard, and P. M. Visscher, Prediction of individual genetic risk of complex disease, Current opinion in genetics & development, vol.18, pp.257-263, 2008.

N. R. Wray, S. H. Lee, D. Mehta, A. A. Vinkhuyzen, F. Dudbridge et al., Research review: polygenic methods and their application to psychiatric traits, Journal of Child Psychology and Psychiatry, vol.55, issue.10, pp.1068-1087, 2014.

N. R. Wray, C. Wijmenga, P. F. Sullivan, J. Yang, and P. M. Visscher, Common disease is more complex than implied by the core gene omnigenic model, Cell, vol.173, issue.7, pp.1573-1580, 2018.

J. Yang, B. Benyamin, B. P. McEvoy, S. Gordon, A. K. Henders et al., Common SNPs explain a large proportion of the heritability for human height, Nature genetics, vol.42, issue.7, p.565, 2010.

J. Yang, N. A. Zaitlen, M. E. Goddard, P. M. Visscher, and A. L. Price, Advantages and pitfalls in the application of mixed-model association methods, Nature genetics, vol.46, issue.2, p.100, 2014.

J. Yang, A. Bakshi, Z. Zhu, G. Hemani, A. A. Vinkhuyzen et al., Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index, Nature genetics, vol.47, issue.10, p.1114, 2015.

W. Zhou, J. B. Nielsen, L. G. Fritsche, R. Dey, M. E. Gabrielsen et al., Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies, Nature genetics, vol.50, issue.9, p.1335, 2018.

X. Zhou, P. Carbonetto, and M. Stephens, Polygenic modeling with Bayesian sparse linear mixed models, PLoS genetics, vol.9, issue.2, p.1003264, 2013.

H. Zou, The adaptive lasso and its oracle properties, Journal of the American statistical association, vol.101, issue.476, pp.1418-1429, 2006.

O. Zuk, E. Hechter, S. R. Sunyaev, and E. S. Lander, The mystery of missing heritability: Genetic interactions create phantom heritability, Proceedings of the National Academy of Sciences, vol.109, issue.4, pp.1193-1198, 2012.

O. Zuk, S. F. Schaffner, K. Samocha, R. Do, E. Hechter et al., Searching for missing heritability: designing rare variant association studies, Proceedings of the National Academy of Sciences, vol.111, issue.4, pp.455-464, 2014.