, Scientific Data Mining and Knowledge Discovery, pp.207-247, 2009.
SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences, BMC genomics, vol.15, p.925, 2014. ,
Genome structural variation discovery and genotyping, Nature Reviews Genetics, vol.12, p.363, 2011. ,
Discovering motifs that induce sequencing errors, BMC bioinformatics, vol.14, p.1, 2013. ,
Principes de biologie moléculaire en biologie clinique, ISBN: 9782842996857, 2005. ,
CpG dinucleotides and human disorders, eLS, 2006. ,
Faster approximate string matching, Algorithmica, vol.23, pp.127-158, 1999. ,
DREME: motif discovery in transcription factor ChIP-seq data, Bioinformatics, vol.27, pp.1653-1659, 2011. ,
MEME SUITE: tools for motif discovery and searching, Nucleic acids research, p.335, 2009. ,
Lynch syndrome: Towards a multidisciplinary management of tumour screening, Gynecologie, obstetrique & fertilite, vol.39, p.272, 2011. ,
A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, The annals of mathematical statistics, vol.41, pp.164-171, 1970. ,
Next generation sequencing of serum circulating nucleic acids from patients with invasive ductal breast cancer reveals differences to healthy and nonmalignant controls, Molecular cancer research, vol.8, pp.335-342, 2010. ,
Module 4: Sensors and Effectors, Cell Signalling Biology, vol.6, pp.1749-7787, 2014. ,
pyAmpli: an amplicon-based variant filter pipeline for targeted resequencing data, BMC bioinformatics, vol.18, p.554, 2017. ,
DNA methylation and the frequency of CpG in animal DNA, Nucleic Acids Research, vol.8, pp.1499-1504, 1980. ,
A fast string searching algorithm, Communications of the ACM, vol.20, pp.762-772, 1977. ,
Algorithms for minimization without derivatives, Courier Corporation, 2013. ,
Evaluation of gene structure prediction programs, Genomics, vol.34, pp.353-367, 1996. ,
Detecting Low-Fraction Variants in Low-Quality Cancer Samples from Whole-exome Sequencing Data, p.43612, 2016. ,
MATRIX SEARCH 1.0: a computer program that scans DNA sequences for transcriptional elements using a database of weight matrices, Bioinformatics, vol.11, pp.563-566, 1995. ,
Integration of external signaling pathways with the core transcriptional network in embryonic stem cells, Cell, vol.133, pp.1106-1117, 2008. ,
Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals, Bioinformatics, vol.30, pp.1707-1713, 2014. ,
Molecular testing for BRAF mutations to inform melanoma treatment decisions: a move toward precision medicine, Modern Pathology, vol.31, p.24, 2018. ,
Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression, Genome research, vol.19, pp.2172-2184, 2009. ,
Non-invasive prenatal assessment of trisomy 21 by multiplexed maternal plasma DNA sequencing: large scale validity study, BMJ, vol.342, p.7401, 2011. ,
, French translation: Le Seuil, 1957.
Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples, Nature biotechnology, vol.31, p.213, 2013. ,
Genetic mutation, Nature Education, vol.1, p.187, 2008. ,
Single nucleotide polymorphism identification in polyploids: a review, example, and recommendations, Molecular plant, vol.8, pp.831-846, 2015. ,
A map of human genome variation from population-scale sequencing, Nature, vol.467, p.1061, 2010. ,
Initial sequencing and analysis of the human genome, Nature, vol.409, p.860, 2001. ,
VHL mosaicism can be detected by clinical next-generation sequencing and is not restricted to patients with a mild phenotype, European Journal of Human Genetics, vol.22, p.1149, 2014. ,
Hidden markov models, Current opinion in structural biology, vol.6, pp.361-365, 1996. ,
Profile hidden Markov models, Bioinformatics, vol.14, pp.755-763, 1998. ,
Gene Expression Omnibus: NCBI gene expression and hybridization array data repository, Nucleic acids research, vol.30, pp.207-210, 2002. ,
A universal framework for regulatory element discovery across all genomes and data types, Molecular cell, vol.28, pp.337-350, 2007. ,
Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood, Proceedings of the National Academy of Sciences, vol.105, pp.16266-16271, 2008. ,
Indexing compressed text, Journal of the ACM (JACM), vol.52, pp.552-581, 2005. ,
Structural variation in the human genome, Nature Reviews Genetics, vol.7, p.85, 2006. ,
On the interpretation of χ² from contingency tables, and the calculation of P, Journal of the Royal Statistical Society, vol.85, pp.87-94, 1922. ,
Many mosaic mutations, Current Oncology, vol.20, p.85, 2013. ,
Accuracy of next generation sequencing platforms, Next generation sequencing & applications, p.1, 2014. ,
Identification of common motifs in unaligned DNA sequences: application to Escherichia coli Lrp regulon, Bioinformatics, vol.11, pp.379-387, 1995. ,
Hereditary nonpolyposis colorectal cancer. Definition, genetics, diagnosis, and medical surveillance, Gastroenterologie clinique et biologique, vol.27, p.708, 2003. ,
Haplotype-based variant detection from short-read sequencing, 2012. ,
An overview of chemical processes that damage cellular DNA: spontaneous hydrolysis, alkylation, and reactions with radicals, Chemical research in toxicology, vol.22, pp.1747-1760, 2009. ,
Reliable detection of subclonal single-nucleotide variants in tumour cell populations, Nature communications, vol.3, p.811, 2012. ,
Oncogènes et leucémies: historique et perspectives, médecine/sciences, vol.19, pp.201-210, 2003. ,
Field guide to next-generation DNA sequencers, Molecular ecology resources, vol.11, pp.759-769, 2011. ,
An approximation to the distribution of finite sample size mutual information estimates in Communications, IEEE, vol.2, pp.1102-1106, 2005. ,
Coming of age: ten years of next-generation sequencing technologies, Nature Reviews Genetics, vol.17, p.333, 2016. ,
Spontaneous mutations, An Introduction to Genetic Analysis, 2000. ,
Introduction à l'analyse génétique, ISBN: 9782744500978, 2002. ,
Meta-MEME: motif-based hidden Markov models of protein families, Bioinformatics, vol.13, pp.397-406, 1997. ,
MERIT reveals the impact of genomic context on sequencing error rate in ultra-deep applications, BMC Bioinformatics, vol.19, p.219, 2018. ,
DNA-damage repair; the good, the bad, and the ugly, The EMBO journal, vol.27, pp.589-605, 2008. ,
Transcriptional regulatory code of a eukaryotic genome, Nature, vol.431, p.99, 2004. ,
Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities, Molecular cell, vol.38, pp.576-589, 2010. ,
Discovering regulatory elements in non-coding sequences by analysis of spaced dyads, Nucleic acids research, vol.28, pp.1808-1818, 2000. ,
URL : https://hal.archives-ouvertes.fr/hal-01624381
OMICtools: an informative directory for multi-omic data analysis, Database, 2014. ,
URL : https://hal.archives-ouvertes.fr/inserm-01026133
DNA methylation and mutation, Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis, vol.285, pp.61-67, 1993. ,
A Simple Sequentially Rejective Multiple Test Procedure, Scandinavian Journal of Statistics, vol.6, pp.65-70, 1979. ,
Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae, Journal of molecular biology, vol.296, pp.1205-1214, 2000. ,
Distribution of mutual information, in Advances in neural information processing systems, pp.399-406, 2002. ,
Systematic comparison of variant calling pipelines using gold standard personal exome variants, Scientific reports, vol.5, p.17875, 2015. ,
De novo assembly and genotyping of variants using colored de Bruijn graphs, Nature genetics, vol.44, p.226, 2012. ,
Controlled 15-year trial on screening for colorectal cancer in families with hereditary nonpolyposis colorectal cancer, Gastroenterology, vol.118, pp.829-834, 2000. ,
Efficient discovery of conserved patterns using a pattern graph, Bioinformatics, vol.13, pp.509-522, 1997. ,
Bias and artifacts in multitemplate polymerase chain reactions (PCR), Journal of bioscience and bioengineering, vol.96, pp.317-323, 2003. ,
, , ISBN: 9782807308015, 2018.
Addressing NGS Data Challenges: Efficient High Throughput Processing and Sequencing Error Detection, PhD thesis, 2015. ,
A restriction enzyme from Hemophilus influenzae: II. Base sequence of the recognition site, Journal of molecular biology, vol.51, pp.393-409, 1970. ,
α-Synucleinopathy associated with G51D SNCA mutation: A link between Parkinson's disease and multiple system atrophy?, Acta neuropathologica, vol.125, pp.753-769, 2013. ,
Signaling through the JAK/STAT pathway, recent advances and future challenges, Gene, vol.285, pp.1-24, 2002. ,
Computation of the noncentral gamma distribution, SIAM Journal on Scientific Computing, vol.17, pp.1224-1231, 1996. ,
Fast pattern matching in strings, SIAM journal on computing, vol.6, pp.323-350, 1977. ,
DNA-binding specificities of the GATA transcription factor family, Molecular and cellular biology, vol.13, pp.4011-4022, 1993. ,
VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing, Genome research, vol.22, pp.568-576, 2012. ,
The Interlaboratory RObustness of Next-generation sequencing (IRON) study: a deep sequencing investigation of TET2, CBL and KRAS mutations by an international consortium involving 10 laboratories, Leukemia, vol.25, p.1840, 2011. ,
, Handbook of Open Source Tools, pp.127-143, 2011.
Hidden Markov models in computational biology: Applications to protein modeling, Journal of molecular biology, vol.235, pp.1501-1531, 1994. ,
Approximate string searching under weighted edit distance, in Proc. WSP'96, pp.156-170, 1996. ,
Fast parallel and serial approximate string matching, Journal of algorithms, vol.10, pp.157-169, 1989. ,
Fast gapped-read alignment with Bowtie 2, Nature methods, vol.9, p.357, 2012. ,
Biochimie-Tout le cours en fiches-2e éd: 200 fiches de cours, 155 QCM, sujets de synthèse et ressources en ligne, ISBN: 9782100765959, 2017. ,
Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Science, vol.262, pp.208-208, 1993. ,
Ultra-deep sequencing of foraminiferal microbarcodes unveils hidden richness of early monothalamous lineages in deep-sea sediments, Proceedings of the National Academy of Sciences, vol.108, pp.13177-13182, 2011. ,
The sequence read archive, Nucleic acids research, vol.39, pp.19-21, 2010. ,
Toward better understanding of artifacts in variant calling from high-coverage samples, Bioinformatics, vol.30, pp.2843-2851, 2014. ,
Fast and accurate short read alignment with Burrows-Wheeler transform, Bioinformatics, vol.25, pp.1754-1760, 2009. ,
Bayesian models for multiple local sequence alignment and Gibbs sampling strategies, Journal of the American Statistical Association, vol.90, pp.1156-1170, 1995. ,
Variant callers for next-generation sequencing data: a comparison study, PloS one, vol.8, p.75619, 2013. ,
BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes, Pacific symposium on biocomputing, vol.6, pp.127-138, 2001. ,
Discriminative pattern mining and its applications in bioinformatics, Briefings in bioinformatics, vol.16, pp.884-900, 2014. ,
Performance comparison of benchtop high-throughput sequencing platforms, Nature biotechnology, vol.30, p.434, 2012. ,
Discriminative learning for probabilistic sequence analysis, PhD thesis, Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, 2015. ,
Binding site discovery from nucleic acid sequences by discriminative learning of Hidden Markov Models, Nucleic acids research, vol.42, pp.12995-13011, 2014. ,
MEME-ChIP: motif analysis of large DNA datasets, Bioinformatics, vol.27, pp.1696-1697, 2011. ,
Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw, Nature, vol.480, p.368, 2011. ,
Suffix arrays: a new method for on-line string searches, SIAM Journal on Computing, vol.22, pp.935-948, 1993. ,
Extraction de Motifs Communs dans un Ensemble de Séquences. Application à l'identification de sites de liaison aux protéines dans les séquences primaires d'ADN, PhD thesis, 2006. ,
Efficient exact motif discovery, Bioinformatics, vol.25, pp.356-364, 2009. ,
JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles, Nucleic acids research, vol.44, pp.110-115, 2016. ,
A space-economical suffix tree construction algorithm, Journal of the ACM (JACM), vol.23, pp.262-272, 1976. ,
Identification and correction of systematic error in high-throughput sequence data, BMC bioinformatics, vol.12, p.451, 2011. ,
RSAT 2015: regulatory sequence analysis tools, Nucleic acids research, p.362, 2015. ,
SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution, PLoS computational biology, vol.10, p.1003665, 2014. ,
Computer analysis of transcription regulatory patterns in completely sequenced bacterial genomes, Nucleic Acids Research, vol.27, pp.2981-2989, 1999. ,
A codon substitution model that incorporates the effect of the GC contents, the gene density and the density of CpG islands of human chromosomes, BMC genomics, vol.12, p.397, 2011. ,
Evolutionary consequences of DNA methylation on the GC content in vertebrate genomes, Genes, Genomes, Genetics, vol.3, pp.441-447, 2015. ,
OutLyzer: software for extracting low-allele-frequency tumor mutations from sequencing background noise in clinical practice, Oncotarget, vol.7, p.79485, 2016. ,
A fast bit-vector algorithm for approximate string matching based on dynamic programming, Journal of the ACM (JACM), vol.46, pp.395-415, 1999. ,
Sequence-specific error profile of Illumina sequencers, Nucleic acids research, vol.39, pp.90-90, 2011. ,
A guided tour to approximate string matching, ACM computing surveys (CSUR), vol.33, pp.31-88, 2001. ,
NR-grep: a fast and flexible pattern-matching tool, Software: Practice and Experience, vol.31, pp.1265-1312, 2001. ,
Extracting protein alignment models from the sequence database, Nucleic Acids Research, vol.25, pp.1665-1677, 1997. ,
The transcriptional and signalling networks of pluripotency, Nature cell biology, vol.13, p.490, 2011. ,
The life history of 21 breast cancers, Cell, vol.149, pp.994-1007, 2012. ,
Pseudocounts for transcription factor binding sites, Nucleic acids research, vol.37, pp.939-944, 2008. ,
The general transcription factors of RNA polymerase II, Genes & development, vol.10, pp.2657-2683, 1996. ,
ChIP-seq: advantages and challenges of a maturing technology, Nature Reviews Genetics, vol.10, p.669, 2009. ,
An algorithm for finding signals of unknown length in DNA sequences, Bioinformatics, vol.17, pp.207-221, 2001. ,
Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes, Nucleic acids research, vol.32, pp.199-203, 2004. ,
Repeat instability: mechanisms of dynamic mutations, Nature Reviews Genetics, vol.6, p.729, 2005. ,
Combinatorial approaches to finding subtle signals in DNA sequences, vol.8, pp.269-278, 2000. ,
DNA replication and causes of mutation, Nature education, vol.1, p.214, 2008. ,
Analysis of TSC cortical tubers by deep sequencing of TSC1, TSC2 and KRAS demonstrates that small second-hit mutations in these genes are rare events, Brain pathology, vol.20, pp.1096-1105, 2010. ,
Ultra deep sequencing detects a low rate of mosaic mutations in tuberous sclerosis complex, Human genetics, vol.127, pp.573-582, 2010. ,
A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers, BMC genomics, vol.13, p.341, 2012. ,
Improvements to a program for DNA analysis: a procedure to find homologies among many sequences, Nucleic acids research, vol.10, pp.449-456, 1982. ,
Global variation in copy number in the human genome, Nature, vol.444, p.444, 2006. ,
Computational approaches to identify promoters and cis-regulatory elements in plant genomes, Plant physiology, vol.132, pp.1162-1176, 2003. ,
DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data, BMC bioinformatics, vol.19, p.223, 2018. ,
A double combinatorial approach to discovering patterns in biological sequences, in Annual Symposium on Combinatorial Pattern Matching, pp.186-208, 1996. ,
URL : https://hal.archives-ouvertes.fr/hal-00435048
Evaluating variant calling tools for non-matched next-generation sequencing data, Scientific reports, vol.7, p.43169, 2017. ,
Potentials and limitations of motif-based binding site prediction in DNA, PhD thesis, 2008. ,
A survey of motif discovery methods in an integrated framework, Biology direct, vol.1, p.11, 2006. ,
DNA sequencing with chain-terminating inhibitors, Proceedings of the national academy of sciences, vol.74, pp.5463-5467, 1977. ,
Compound Poisson approximation of word counts in DNA sequences, ESAIM: probability and statistics, vol.1, pp.1-16, 1997. ,
Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform, Nucleic acids research, vol.43, pp.37-37, 2015. ,
Sequence logos: a new way to display consensus sequences, Nucleic Acids Research, vol.18, pp.6097-6100, 1990. ,
Linguistic approaches to biological sequences, Bioinformatics, vol.13, pp.333-344, 1997. ,
String variable grammar: A logic grammar formalism for the biological language of DNA, The Journal of Logic Programming, vol.24, pp.73-102, 1995. ,
The computational linguistics of biological sequences, Artificial intelligence and molecular biology, vol.2, pp.47-120, 1993. ,
Ultraviolet exposure as the main initiator of p53 mutations in basal cell carcinomas from psoralen and ultraviolet A-treated patients with psoriasis, Journal of Investigative Dermatology, vol.80, pp.365-370, 1992. ,
The theory and computation of evolutionary distances: pattern recognition, Journal of algorithms, vol.1, pp.359-373, 1980. ,
A systematic error in mass flow calorimetry demonstrated, Thermochimica acta, vol.387, pp.95-100, 2002. ,
dbSNP: the NCBI database of genetic variation, Nucleic acids research, vol.29, pp.308-311, 2001. ,
AMD, an automated motif discovery tool using stepwise refinement of gapped consensuses, PloS one, vol.6, p.24576, 2011. ,
Characterization of sequence-specific errors in various next-generation sequencing systems, Molecular BioSystems, vol.12, pp.914-922, 2016. ,
An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data, Nucleic acids research, vol.41, pp.89-89, 2013. ,
On counting position weight matrix matches in a sequence, with application to discriminative motif finding, Bioinformatics, vol.22, pp.454-463, 2006. ,
A statistical method for finding transcription factor binding sites, vol.8, pp.344-354, 2000. ,
YMF: a program for discovery of novel transcription factor binding sites by statistical overrepresentation, Nucleic acids research, vol.31, pp.3586-3588, 2003. ,
Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli, Nucleic acids research, vol.10, pp.2997-3011, 1982. ,
An integrated map of structural variation in 2,504 human genomes, Nature, vol.526, p.75, 2015. ,
A global role for KLF1 in erythropoiesis revealed by ChIP-seq in primary erythroid cells, Genome research, vol.20, pp.1052-1063, 2010. ,
Unified representation of genetic variants, Bioinformatics, vol.31, pp.2202-2204, 2015. ,
Detection of genomic structural variants from next-generation sequencing data, Frontiers in bioengineering and biotechnology, vol.3, p.92, 2015. ,
A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes, Journal of Computational Biology, vol.9, pp.447-464, 2002. ,
Elements of information theory, 2006. ,
RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets, Nucleic acids research, vol.40, pp.31-31, 2011. ,
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration, Briefings in bioinformatics, vol.14, pp.178-192, 2013. ,
The mutation rate and cancer, Proceedings of the National Academy of Sciences, vol.93, pp.14800-14803, 1996. ,
An exact method for finding short motifs in sequences, with application to the ribosome binding site problem, ISMB 99, pp.262-271, 1999. ,
Assessing computational tools for the discovery of transcription factor binding sites, Nature biotechnology, vol.23, pp.137-144, 2005. ,
Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies, Journal of molecular biology, vol.281, pp.827-842, 1998. ,
Promoter sequences and algorithmical methods for identifying them, Research in Microbiology, vol.150, pp.779-799, 1999. ,
URL : https://hal.archives-ouvertes.fr/hal-00428461
Estimating genotype error rates from high-coverage next-generation sequence data, Genome research, vol.24, pp.1734-1739, 2014. ,
Molecular structure of nucleic acids, Nature, vol.171, pp.737-738, 1953. ,
Linear pattern matching algorithms, in Switching and Automata Theory, SWAT'08, IEEE Conference Record of 14th Annual Symposium on, pp.1-11, 1973. ,
Use of next-generation DNA sequencing to analyze genetic variants in rheumatic disease, Arthritis research & therapy, vol.16, p.490, 2014. ,
LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets, Nucleic acids research, vol.40, pp.11189-11201, 2012. ,
Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals, Nature, vol.434, p.338, 2005. ,
A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data, Computational and structural biotechnology journal, 2018. ,
A survey of error-correction methods for next-generation sequencing, Briefings in bioinformatics, vol.14, pp.56-66, 2012. ,
Exploiting CpG hypermutability to identify phenotypically significant variation within human protein-coding genes, Genome biology and evolution, vol.3, pp.938-949, 2011. ,
Review of clinical next-generation sequencing, Archives of pathology & laboratory medicine, vol.141, pp.1544-1557, 2017. ,
On Student's 1908 Article "The Probable Error of a Mean", Journal of the American Statistical Association, vol.103, pp.1-7, 2008. ,
Sequence context analysis of 8.2 million single nucleotide polymorphisms in the human genome, Gene, vol.366, pp.316-324, 2006. ,
Synthetic spike-in standards improve run-specific systematic error analysis for DNA and RNA sequencing, PloS one, vol.7, p.41356, 2012. ,
Extensive sequencing of seven human genomes to characterize benchmark reference materials, Scientific data, vol.3, p.160025, 2016. ,
Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls, Nature biotechnology, vol.32, p.246, 2014. ,
WEB REFERENCES FOR THE FIGURES
165. SAMTOOLS. hts-specs: Specifications of SAM/BAM and related high-throughput sequencing file formats
sequencing / next-generation-sequencing / ion-torrentnext-generation-sequencing-workflow/ion-torrent-next-generationsequencing-data-analysis-workflow/ion-torrent-suite-software ,