A. , M. &. Ghanem, and M. , Scientific Data Mining and Knowledge Discovery, pp.207-247, 2009.

A. , F. Cirillo, D. Ponti, R. D. Tartaglia, and G. G. , SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences, BMC genomics, vol.15, p.925, 2014.

A. , C. Coe, B. P. Eichler, and E. E. , Genome structural variation discovery and genotyping, Nature Reviews Genetics, vol.12, p.363, 2011.

A. and M. , Discovering motifs that induce sequencing errors, BMC bioinformatics, vol.14, p.1, 2013.

A. , N. Bogard, M. Lamoril, and J. , Principes de biologie moléculaire en biologie clinique, ISBN: 9782842996857, 2005.

A. and S. E. , CpG dinucleotides and human disorders, eLS, 2006.

B. , R. Navarro, and G. , Faster approximate string matching, Algorithmica, vol.23, pp.127-158, 1999.

B. and T. L. , DREME: motif discovery in transcription factor ChIP-seq data, Bioinformatics, vol.27, pp.1653-1659, 2011.

B. and T. L. , MEME SUITE: tools for motif discovery and searching, Nucleic acids research, p.335, 2009.

B. , A. Cellier, C. Samaha, E. Laurent-puig, P. Lecuru et al., Lynch syndrome: Towards a multidisciplinary management of tumour screening. Gynecologie, obstetrique & fertilite, vol.39, p.272, 2011.

L. E. Baum, T. Petrie, G. Soules, and N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, The Annals of Mathematical Statistics, vol.41, pp.164-171, 1970.

B. , J. Urnovitz, H. B. Mitchell, W. M. Schütz, and E. , Next generation sequencing of serum circulating nucleic acids from patients with invasive ductal breast cancer reveals differences to healthy and nonmalignant controls, Molecular cancer research, vol.8, pp.335-342, 2010.

B. and M. J. , Module 4: Sensors and Effectors, Cell Signalling Biology, vol.6, pp.1749-7787, 2014.

B. , M. Boeckx, N. Van-camp, G. De, B. et al., pyAmpli: an amplicon-based variant filter pipeline for targeted resequencing data, BMC bioinformatics, vol.18, p.554, 2017.

B. and A. P. , DNA methylation and the frequency of CpG in animal DNA, Nucleic Acids Research, vol.8, pp.1499-1504, 1980.

R. S. Boyer and J. S. Moore, A fast string searching algorithm, Communications of the ACM, vol.20, pp.762-772, 1977.

B. and R. P. , Algorithms for minimization without derivatives, Courier Corporation, 2013.

B. , M. Guigo, and R. , Evaluation of gene structure prediction programs, Genomics, vol.34, pp.353-367, 1996.

C. , J. Majewski, and J. , LoLoPicker: Detecting Low-Fraction Variants in Low-Quality Cancer Samples from Whole-exome Sequencing Data, p.43612, 2016.

C. , Q. K. Hertz, G. Z. Stormo, and G. D. , MATRIX SEARCH 1.0: a computer program that scans DNA sequences for transcriptional elements using a database of weight matrices, Bioinformatics, vol.11, pp.563-566, 1995.

C. and X. , Integration of external signaling pathways with the core transcriptional network in embryonic stem cells, Cell, vol.133, pp.1106-1117, 2008.

C. , A. Y. Teo, Y. Ong, and R. T. , Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals, Bioinformatics, vol.30, pp.1707-1713, 2014.

C. , L. Lopez-beltran, A. Massari, F. Maclennan, G. T. Montironi et al., Molecular testing for BRAF mutations to inform melanoma treatment decisions: a move toward precision medicine, Modern Pathology, vol.31, p.24, 2018.

C. and Y. , Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression, Genome research, vol.19, pp.2172-2184, 2009.

C. and R. W. , Non-invasive prenatal assessment of trisomy 21 by multiplexed maternal plasma DNA sequencing: large scale validity study, BMJ, vol.342, p.7401, 2011.

C. , N. , Syntactic Structures, Mouton, The Hague (French translation: Le Seuil), 1957.

C. and K. , Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples, Nature biotechnology, vol.31, p.213, 2013.

C. and S. , Genetic mutation, Nature Education, vol.1, p.187, 2008.

C. , J. Chavarro, C. Pearl, S. A. Ozias-akins, P. Jackson et al., Single nucleotide polymorphism identification in polyploids: a review, example, and recommendations, Molecular plant, vol.8, pp.831-846, 2015.

C. and 1. G. , A map of human genome variation from population-scale sequencing, Nature, vol.467, p.1061, 2010.

C. and I. H. , Initial sequencing and analysis of the human genome, Nature, vol.409, p.860, 2001.

C. and L. , VHL mosaicism can be detected by clinical next-generation sequencing and is not restricted to patients with a mild phenotype, European Journal of Human Genetics, vol.22, p.1149, 2014.

C. , M. Rytter, and W. , , 1994.

C. , M. Hancart, C. Lecroq, and T. , , 2001.

E. and S. R. , Hidden markov models, Current opinion in structural biology, vol.6, pp.361-365, 1996.

E. and S. R. , Profile hidden Markov models, Bioinformatics, vol.14, pp.755-763, 1998.

E. , R. Domrachev, M. Lash, and A. E. , Gene Expression Omnibus: NCBI gene expression and hybridization array data repository, Nucleic acids research, vol.30, pp.207-210, 2002.

E. , O. Slonim, N. Tavazoie, and S. , A universal framework for regulatory element discovery across all genomes and data types, Molecular cell, vol.28, pp.337-350, 2007.

F. , H. C. Blumenfeld, Y. J. Chitkara, U. Hudgins, L. Quake et al., Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood, Proceedings of the National Academy of Sciences, vol.105, pp.16266-16271, 2008.

F. , P. Manzini, and G. , Indexing compressed text, Journal of the ACM (JACM), vol.52, pp.552-581, 2005.

F. , L. Carson, A. R. Scherer, and S. W. , Structural variation in the human genome, Nature Reviews Genetics, vol.7, p.85, 2006.

F. and R. A. , On the interpretation of χ² from contingency tables, and the calculation of P, Journal of the Royal Statistical Society, vol.85, pp.87-94, 1922.

F. , W. Real, and F. , Many mosaic mutations, Current Oncology, vol.20, p.85, 2013.

F. , E. J. Reid-bayliss, K. S. Emond, M. J. Loeb, and L. A. , Accuracy of next generation sequencing platforms, Next Generation, Sequencing & Applications, p.1, 2014.

F. , Y. M. Mandel, Y. Friedberg, D. Margalit, and H. , Identification of common motifs in unaligned DNA sequences: application to Escherichia coli Lrp regulon, Bioinformatics, vol.11, pp.379-387, 1995.

F. , T. Mauillon, J. Thomas, G. Olschwang, and S. , Hereditary nonpolyposis colorectal cancer. Definition, genetics, diagnosis, and medical surveillance, Gastroenterologie clinique et biologique, vol.27, p.708, 2003.

E. Garrison and G. Marth, Haplotype-based variant detection from short-read sequencing, 2012.

G. and K. S. , An overview of chemical processes that damage cellular DNA: spontaneous hydrolysis, alkylation, and reactions with radicals, Chemical research in toxicology, vol.22, pp.1747-1760, 2009.

G. and M. , Reliable detection of subclonal single-nucleotide variants in tumour cell populations, Nature communications, vol.3, p.811, 2012.

G. and S. , Oncogènes et leucémies: historique et perspectives, médecine/sciences, vol.19, pp.201-210, 2003.

G. and T. C. , Field guide to next-generation DNA sequencers, Molecular ecology resources, vol.11, pp.759-769, 2011.

G. , B. Dawy, Z. Hagenauer, J. Mueller, and J. C. , An approximation to the distribution of finite sample size mutual information estimates in Communications, IEEE, vol.2, pp.1102-1106, 2005.

G. , S. Mcpherson, J. D. Mccombie, and W. R. , Coming of age: ten years of next-generation sequencing technologies, Nature Reviews Genetics, vol.17, p.333, 2016.

G. , A. Miller, J. Suzuki, D. Lewontin, R. Gelbart et al., Spontaneous mutations, An Introduction to Genetic Analysis, 2000.

G. and A. , Introduction à l'analyse génétique, ISBN: 9782744500978, 2002.

G. , W. N. Bailey, T. L. Elkan, C. P. Baker, and M. E. , Meta-MEME: motif-based hidden Markov models of protein families, Bioinformatics, vol.13, pp.397-406, 1997.

H. , M. Khiabanian, and H. , MERIT reveals the impact of genomic context on sequencing error rate in ultra-deep applications, BMC Bioinformatics, vol.19, p.219, 2018.

H. and R. , DNA-damage repair; the good, the bad, and the ugly, The EMBO journal, vol.27, pp.589-605, 2008.

H. and C. T. , Transcriptional regulatory code of a eukaryotic genome, Nature, vol.431, p.99, 2004.

H. and S. , Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities, Molecular cell, vol.38, pp.576-589, 2010.

H. , J. V. Rios, A. Collado-vides, and J. , Discovering regulatory elements in non-coding sequences by analysis of spaced dyads, Nucleic acids research, vol.28, pp.1808-1818, 2000.
URL : https://hal.archives-ouvertes.fr/hal-01624381

H. , V. J. Bandrowski, A. E. Pepin, A. Gonzalez, B. J. Desfeux et al., OMICtools: an informative directory for multi-omic data analysis, Database, 2014.
URL : https://hal.archives-ouvertes.fr/inserm-01026133

H. , R. Grigg, and G. , DNA methylation and mutation, Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis, vol.285, pp.61-67, 1993.

H. and S. , A Simple Sequentially Rejective Multiple Test Procedure, Scandinavian Journal of Statistics, vol.6, pp.65-70, 1979.

H. , J. D. Estep, P. W. Tavazoie, S. Church, and G. M. , Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae, Journal of molecular biology, vol.296, pp.1205-1214, 2000.

H. and M. , Distribution of mutual information, in Advances in neural information processing systems, pp.399-406, 2002.

H. , S. Kim, E. Lee, I. Marcotte, and E. M. , Systematic comparison of variant calling pipelines using gold standard personal exome variants, Scientific reports, vol.5, p.17875, 2015.

I. , Z. Caccamo, M. Turner, I. Flicek, P. Mcvean et al., De novo assembly and genotyping of variants using colored de Bruijn graphs, Nature genetics, vol.44, p.226, 2012.

J. and H. J. , Controlled 15-year trial on screening for colorectal cancer in families with hereditary nonpolyposis colorectal cancer, Gastroenterology, vol.118, pp.829-834, 2000.

J. and I. , Efficient discovery of conserved patterns using a pattern graph, Bioinformatics, vol.13, pp.509-522, 1997.

T. Kanagawa, Bias and artifacts in multitemplate polymerase chain reactions (PCR), Journal of bioscience and bioengineering, vol.96, pp.317-323, 2003.

K. , G. Isawa, J. Marshall, and W. , Biologie cellulaire, ISBN: 9782807308015, 2018.

K. and A. , Addressing NGS Data Challenges: Efficient High Throughput Processing and Sequencing Error Detection, PhD thesis, 2015.

K. Jr, T. J. Smith, and H. O. , A restriction enzyme from Hemophilus influenzae: II. Base sequence of the recognition site, Journal of molecular biology, vol.51, pp.393-409, 1970.

K. and A. P. , α-Synucleinopathy associated with G51D SNCA mutation: A link between Parkinson's disease and multiple system atrophy?, Acta neuropathologica, vol.125, pp.753-769, 2013.

K. , T. Bhattacharya, S. Braunstein, J. Schindler, and C. , Signaling through the JAK/STAT pathway, recent advances and future challenges, Gene, vol.285, pp.1-24, 2002.

K. , L. Bablok, and B. , Computation of the noncentral gamma distribution, SIAM Journal on Scientific Computing, vol.17, pp.1224-1231, 1996.

K. , D. E. Morris, J. H. Pratt, and V. R. , Fast pattern matching in strings, SIAM journal on computing, vol.6, pp.323-350, 1977.

K. O. , L. Engel, and J. , DNA-binding specificities of the GATA transcription factor family, Molecular and cellular biology, vol.13, pp.4011-4022, 1993.

K. and D. C. , VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing, Genome research, vol.22, pp.568-576, 2012.

K. and A. , The Interlaboratory RObustness of Next-generation sequencing (IRON) study: a deep sequencing investigation of TET2, CBL and KRAS mutations by an international consortium involving 10 laboratories, Leukemia, vol.25, p.1840, 2011.

K. and S. , Handbook of Open Source Tools, pp.127-143, 2011.

K. , A. Brown, M. Mian, I. S. Sjölander, K. Haussler et al., Hidden Markov models in computational biology: Applications to protein modeling, Journal of molecular biology, vol.235, pp.1501-1531, 1994.

K. and S. , Approximate string searching under weighted edit distance, in Proc. WSP'96, pp.156-170, 1996.

L. , G. M. Vishkin, and U. , Fast parallel and serial approximate string matching, Journal of algorithms, vol.10, pp.157-169, 1989.

L. , B. Salzberg, and S. L. , Fast gapped-read alignment with Bowtie 2, Nature methods, vol.9, p.357, 2012.

L. , N. Bleicher-bardeletti, F. Duclos, B. Vamecq, and J. , Biochimie-Tout le cours en fiches-2e éd: 200 fiches de cours, 155 QCM, sujets de synthèse et ressources en ligne, ISBN: 9782100765959, 2017.

L. and C. E. , Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Science, vol.262, pp.208-208, 1993.

L. and B. , Ultra-deep sequencing of foraminiferal microbarcodes unveils hidden richness of early monothalamous lineages in deep-sea sediments, Proceedings of the National Academy of Sciences, vol.108, pp.13177-13182, 2011.

L. , R. Sugawara, H. Shumway, M. Collaboration, and I. N. , The sequence read archive, Nucleic acids research, vol.39, pp.19-21, 2010.

L. I. and H. , Toward better understanding of artifacts in variant calling from high-coverage samples, Bioinformatics, vol.30, pp.2843-2851, 2014.

L. I. , H. Durbin, and R. , Fast and accurate short read alignment with Burrows-Wheeler transform, Bioinformatics, vol.25, pp.1754-1760, 2009.

L. , J. S. Neuwald, A. F. Lawrence, and C. E. , Bayesian models for multiple local sequence alignment and Gibbs sampling strategies, Journal of the American Statistical Association, vol.90, pp.1156-1170, 1995.

L. , X. Han, S. Wang, Z. Gelernter, J. Yang et al., Variant callers for next-generation sequencing data: a comparison study, PloS one, vol.8, p.75619, 2013.

L. , X. Brutlag, D. L. Liu, and J. S. , BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes, Pacific symposium on biocomputing, vol.6, pp.127-138, 2001.

L. , X. Wu, J. Gu, F. Wang, J. He et al., Discriminative pattern mining and its applications in bioinformatics, Briefings in bioinformatics, vol.16, pp.884-900, 2014.

L. and N. J. , Performance comparison of benchtop high-throughput sequencing platforms, Nature biotechnology, vol.30, p.434, 2012.

M. and J. , Discriminative learning for probabilistic sequence analysis, PhD thesis, Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, 2015.

M. , J. Rajewsky, and N. , Binding site discovery from nucleic acid sequences by discriminative learning of Hidden Markov Models, Nucleic acids research, vol.42, pp.12995-13011, 2014.

M. , P. Bailey, and T. L. , MEME-ChIP: motif analysis of large DNA datasets, Bioinformatics, vol.27, pp.1696-1697, 2011.

M. and R. , Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw, Nature, vol.480, p.368, 2011.

M. , U. Myers, and G. , Suffix arrays: a new method for on-line string searches, SIAM Journal on Computing, vol.22, pp.935-948, 1993.

M. and A. , Extraction de Motifs Communs dans un Ensemble de Séquences, Application à l'identification de sites de liaison aux protéines dans les séquences primaires d'ADN, PhD thesis, 2006.

M. , T. Rahmann, and S. , Efficient exact motif discovery, Bioinformatics, vol.25, pp.356-364, 2009.

M. and A. , JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles, Nucleic acids research, vol.44, pp.110-115, 2016.

M. and E. M. , A space-economical suffix tree construction algorithm, Journal of the ACM (JACM), vol.23, pp.262-272, 1976.

M. and F. , Identification and correction of systematic error in high-throughput sequence data, BMC bioinformatics, vol.12, p.451, 2011.

M. and A. , RSAT 2015: regulatory sequence analysis tools, Nucleic acids research, p.362, 2015.

M. and C. A. , SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution, PLoS computational biology, vol.10, p.1003665, 2014.

M. , A. Koonin, E. Roytberg, M. Gelfand, and M. , Computer analysis of transcription regulatory patterns in completely sequenced bacterial genomes, Nucleic Acids Research, vol.27, pp.2981-2989, 1999.

M. and K. , A codon substitution model that incorporates the effect of the GC contents, the gene density and the density of CpG islands of human chromosomes, BMC genomics, vol.12, p.397, 2011.

M. Jr, J. Pratt, and V. , , 1970.

M. , C. F. Arndt, P. F. Holm, L. Ellegren, and H. , Evolutionary consequences of DNA methylation on the GC content in vertebrate genomes, Genes, Genomes, Genetics, vol.3, pp.441-447, 2015.

M. and E. , OutLyzer: software for extracting low-allele-frequency tumor mutations from sequencing background noise in clinical practice, Oncotarget, vol.7, p.79485, 2016.

M. and G. , A fast bit-vector algorithm for approximate string matching based on dynamic programming, Journal of the ACM (JACM), vol.46, pp.395-415, 1999.

N. and K. , Sequence-specific error profile of Illumina sequencers, Nucleic acids research, vol.39, pp.90-90, 2011.

N. and G. , A guided tour to approximate string matching, ACM computing surveys (CSUR), vol.33, pp.31-88, 2001.

N. and G. , NR-grep: a fast and flexible pattern-matching tool, Software: Practice and Experience, vol.31, pp.1265-1312, 2001.

N. , A. F. Liu, J. S. Lipman, D. J. Lawrence, and C. E. , Extracting protein alignment models from the sequence database, Nucleic Acids Research, vol.25, pp.1665-1677, 1997.

N. G. , H. Surani, and M. A. , The transcriptional and signalling networks of pluripotency, Nature cell biology, vol.13, p.490, 2011.

N. and S. , The life history of 21 breast cancers, Cell, vol.149, pp.994-1007, 2012.

N. , K. Frith, M. C. Nakai, and K. , Pseudocounts for transcription factor binding sites, Nucleic acids research, vol.37, pp.939-944, 2008.

O. , G. Lagrange, T. Reinberg, and D. , The general transcription factors of RNA polymerase II, Genes & development, vol.10, pp.2657-2683, 1996.

P. and P. J. , ChIP-seq: advantages and challenges of a maturing technology, Nature Reviews Genetics, vol.10, p.669, 2009.

P. , G. Mauri, G. Pesole, and G. , An algorithm for finding signals of unknown length in DNA sequences, Bioinformatics, vol.17, pp.207-221, 2001.

P. , G. Mereghetti, P. Mauri, G. Pesole, and G. , Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes, Nucleic acids research, vol.32, pp.199-203, 2004.

P. , C. E. Edamura, K. N. Cleary, and J. D. , Repeat instability: mechanisms of dynamic mutations, Nature Reviews Genetics, vol.6, p.729, 2005.

P. , P. A. Sze, and S. , Combinatorial approaches to finding subtle signals in DNA sequences, vol.8, pp.269-278, 2000.

P. and L. , DNA replication and causes of mutation, Nature education, vol.1, p.214, 2008.

Q. and W. , Analysis of TSC cortical tubers by deep sequencing of TSC1, TSC2 and KRAS demonstrates that small second-hit mutations in these genes are rare events, Brain pathology, vol.20, pp.1096-1105, 2010.

Q. and W. , Ultra deep sequencing detects a low rate of mosaic mutations in tuberous sclerosis complex, Human genetics, vol.127, pp.573-582, 2010.

Q. and M. A. , A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers, BMC genomics, vol.13, p.341, 2012.

Q. , C. Wegman, M. N. Korn, and L. J. , Improvements to a program for DNA analysis: a procedure to find homologies among many sequences, Nucleic acids research, vol.10, pp.449-456, 1982.

R. and R. , Global variation in copy number in the human genome, Nature, vol.444, p.444, 2006.

R. and S. , Computational approaches to identify promoters and cis-regulatory elements in plant genomes, Plant physiology, vol.132, pp.1162-1176, 2003.

S. and C. , DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data, BMC bioinformatics, vol.19, p.223, 2018.

S. , M. Viari, and A. , A double combinatorial approach to discovering patterns in biological sequences, in Annual Symposium on Combinatorial Pattern Matching, pp.186-208, 1996.
URL : https://hal.archives-ouvertes.fr/hal-00435048

S. and S. , Evaluating variant calling tools for non-matched next-generation sequencing data, Scientific reports, vol.7, p.43169, 2017.

S. and G. K. , Potentials and limitations of motif-based binding site prediction in DNA, PhD thesis, 2008.

S. , G. K. Drabløs, and F. , A survey of motif discovery methods in an integrated framework, Biology direct, vol.1, p.11, 2006.

S. , F. Nicklen, S. Coulson, and A. R. , DNA sequencing with chain-terminating inhibitors, Proceedings of the National Academy of Sciences, vol.74, pp.5463-5467, 1977.

S. and S. , Compound Poisson approximation of word counts in DNA sequences, ESAIM: probability and statistics, vol.1, pp.1-16, 1997.

S. and M. , Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform, Nucleic acids research, vol.43, pp.37-37, 2015.

T. D. Schneider and R. Stephens, Sequence logos: a new way to display consensus sequences, Nucleic Acids Research, vol.18, pp.6097-6100, 1990.

S. and D. B. , Linguistic approaches to biological sequences, Bioinformatics, vol.13, pp.333-344, 1997.

S. and D. B. , String variable grammar: A logic grammar formalism for the biological language of DNA, The Journal of Logic Programming, vol.24, pp.73-102, 1995.

S. and D. B. , The computational linguistics of biological sequences, Artificial intelligence and molecular biology, vol.2, pp.47-120, 1993.

S. , D. B. Seidl, and H. , Ultraviolet exposure as the main initiator of p53 mutations in basal cell carcinomas from psoralen and ultraviolet A-treated patients with psoriasis, Journal of Investigative Dermatology, vol.80, pp.365-370, 1992.

P. H. Sellers, The theory and computation of evolutionary distances: pattern recognition, Journal of algorithms, vol.1, pp.359-373, 1980.

K. L. Shanahan, A systematic error in mass flow calorimetry demonstrated, Thermochimica acta, vol.387, pp.95-100, 2002.

S. and S. T. , dbSNP: the NCBI database of genetic variation, Nucleic acids research, vol.29, pp.308-311, 2001.

S. and J. , AMD, an automated motif discovery tool using stepwise refinement of gapped consensuses, PloS one, vol.6, p.24576, 2011.

S. , S. Park, and J. , Characterization of sequence-specific errors in various next-generation sequencing systems, Molecular BioSystems, vol.12, pp.914-922, 2016.

S. and Y. , An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data, Nucleic acids research, vol.41, pp.89-89, 2013.

S. and S. , On counting position weight matrix matches in a sequence, with application to discriminative motif finding, Bioinformatics, vol.22, pp.454-463, 2006.

S. Sinha and M. Tompa, A statistical method for finding transcription factor binding sites, vol.8, pp.344-354, 2000.

S. Sinha and M. Tompa, YMF: a program for discovery of novel transcription factor binding sites by statistical overrepresentation, Nucleic acids research, vol.31, pp.3586-3588, 2003.

S. , G. D. Schneider, T. D. Gold, L. Ehrenfeucht, and A. , Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli, Nucleic acids research, vol.10, pp.2997-3011, 1982.

S. and P. H. , An integrated map of structural variation in 2,504 human genomes, Nature, vol.526, p.75, 2015.

T. and M. R. , A global role for KLF1 in erythropoiesis revealed by ChIP-seq in primary erythroid cells, Genome research, vol.20, pp.1052-1063, 2010.

T. , A. Abecasis, G. R. Kang, and H. M. , Unified representation of genetic variants, Bioinformatics, vol.31, pp.2202-2204, 2015.

T. , L. D'aurizio, R. Magi, and A. , Detection of genomic structural variants from next-generation sequencing data, Frontiers in bioengineering and biotechnology, vol.3, p.92, 2015.

T. and G. , A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes, Journal of Computational Biology, vol.9, pp.447-464, 2002.

T. , J. A. Cover, and T. M. , Elements of information theory, 2006.

T. and M. , RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets, Nucleic acids research, vol.40, pp.31-31, 2011.

T. , H. Robinson, J. T. Mesirov, and J. P. , Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration, Briefings in bioinformatics, vol.14, pp.178-192, 2013.

I. P. Tomlinson, M. Novelli, and W. Bodmer, The mutation rate and cancer, Proceedings of the National Academy of Sciences, vol.93, pp.14800-14803, 1996.

T. and M. , An exact method for finding short motifs in sequences, with application to the ribosome binding site problem, ISMB 99, pp.262-271, 1999.

T. and M. , Assessing computational tools for the discovery of transcription factor binding sites, Nature biotechnology, vol.23, pp.137-144, 2005.

H. Van, J. André, B. Collado-vides, and J. , Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies, Journal of molecular biology, vol.281, pp.827-842, 1998.

V. , A. Marsan, L. Sagot, and M. , Promoter sequences and algorithmical methods for identifying them, Research in Microbiology, vol.150, pp.779-799, 1999.
URL : https://hal.archives-ouvertes.fr/hal-00428461

W. and J. D. , Estimating genotype error rates from high-coverage next-generation sequence data, Genome research, vol.24, pp.1734-1739, 2014.

W. , J. D. Crick, and F. H. , Molecular structure of nucleic acids, Nature, vol.171, pp.737-738, 1953.

P. Weiner, Linear pattern matching algorithms, in IEEE Conference Record of 14th Annual Symposium on Switching and Automata Theory (SWAT), pp.1-11, 1973.

W. , G. B. Kelly, J. A. Gaffney, and P. M. , Use of next-generation DNA sequencing to analyze genetic variants in rheumatic disease, Arthritis research & therapy, vol.16, p.490, 2014.

W. and A. , LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets, Nucleic acids research, vol.40, pp.11189-11201, 2012.

X. and X. , Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals, Nature, vol.434, p.338, 2005.

X. U. and C. , A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data, Computational and structural biotechnology journal, 2018.

Y. , X. Chockalingam, S. P. Aluru, and S. , A survey of error-correction methods for next-generation sequencing, Briefings in bioinformatics, vol.14, pp.56-66, 2012.

Y. , H. Huttley, and G. , Exploiting CpG hypermutability to identify phenotypically significant variation within human protein-coding genes, Genome biology and evolution, vol.3, pp.938-949, 2011.

Y. , S. Thyagarajan, and B. , Review of clinical next-generation sequencing, Archives of pathology & laboratory medicine, vol.141, pp.1544-1557, 2017.

Z. and S. L. , On Student's 1908 Article "The Probable Error of a Mean", Journal of the American Statistical Association, vol.103, pp.1-7, 2008.

Z. , Z. Zhang, and F. , Sequence context analysis of 8.2 million single nucleotide polymorphisms in the human genome, Gene, vol.366, pp.316-324, 2006.

Z. , J. M. Samarov, D. Mcdaniel, J. Sen, S. K. Salit et al., Synthetic spike-in standards improve run-specific systematic error analysis for DNA and RNA sequencing, PloS one, vol.7, p.41356, 2012.

Z. and J. M. , Extensive sequencing of seven human genomes to characterize benchmark reference materials, Scientific data, vol.3, p.160025, 2016.

Z. and J. M. , Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls, Nature biotechnology, vol.32, p.246, 2014.

Web references for the figures

P. and G. , Sparsepp, accessed 2017-2018.

R Core Team.

SAMtools, hts-specs: Specifications of SAM/BAM and related high-throughput sequencing file formats.

Thermo Fisher Scientific, sequencing/next-generation-sequencing/ion-torrent-next-generation-sequencing-workflow/ion-torrent-next-generation-sequencing-data-analysis-workflow/ion-torrent-suite-software