M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 1964.

D. Aldous and P. Shields, A diffusion limit for a class of randomly-growing binary search trees, Probab. Theory Related Fields, pp.509-542, 1988.

J. S. Almeida, J. A. Carriço, A. Maretzek, P. A. Noble, and M. Fletcher, Analysis of genomic sequences by Chaos Game Representation, Bioinformatics, vol.17, issue.5, pp.429-437, 2001.
DOI : 10.1093/bioinformatics/17.5.429

V. Anh, K. S. Lau, and Z. Yu, Multifractal characterisation of complete genomes, Physica A, vol.301, issue.1-4, pp.351-361, 2001.

Y. Baraud, S. Huet, and B. Laurent, Adaptive tests of linear hypotheses by model selection, Ann. Statist., vol.31, issue.1, pp.225-251, 2003.

M. T. Barlow, R. Pemantle, and E. A. Perkins, Diffusion-limited aggregation on a tree, pp.1-60, 1997.

B. Bercu, Weighted estimation and tracking for ARMAX models, SIAM J. Control Optimization, vol.33, issue.1, pp.89-106, 1995.
DOI : 10.1137/s0363012992221803

B. Bercu, On the convergence of moments in the almost sure central limit theorem for martingales with statistical applications, Stochastic Processes and their Applications, pp.157-173, 2004.
DOI : 10.1016/j.spa.2002.10.001

B. Bercu and M. Duflo, Moindres carrés pondérés et poursuite, Ann. Inst. Henri Poincaré, vol.28, issue.3, pp.403-430, 1992.

G. Blom and D. Thorburn, How many random digits are required until given sequences are obtained?, Journal of Applied Probability, vol.19, pp.518-531, 1982.
DOI : 10.2307/3213511

G. A. Brosamler, An almost everywhere central limit theorem, Mathematical Proceedings of the Cambridge Philosophical Society, vol.104, issue.03, pp.561-574, 1988.
DOI : 10.1007/BF01404058

A. M. Campbell, J. Mràzek, and S. Karlin, Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA, Proc. Natl. Acad. Sci. USA, pp.9184-9189, 1999.
DOI : 10.1046/j.1365-2958.1998.01008.x

P. Cénac, Almost sure properties of weighted vectorial martingales transforms with applications to prediction for linear regression models, Probab. Math. Statist. Acta Univ. Wratislav. No, vol.23, issue.1, pp.61-76, 2003.

P. Cénac, Test on the structure of biological sequences via chaos game representation, Stat. Appl. Genet. Mol. Biol, vol.4, issue.36, 2005.

P. Cénac, B. Chauvin, N. Pouyanne, and S. Ginouillac, Digital search trees and chaos game representation, 2006.

P. Cénac, G. Fayolle, and J. M. Lasgouttes, Dynamical systems in the analysis of biological sequences, 2004.

F. Chaâbane, Version forte du théorème de la limite centrale fonctionnel pour les martingales, C. R. Acad. Sci. Paris Sér. I Math., vol.323, issue.2, pp.195-198, 1996.

F. Chaâbane, Invariance principles with logarithmic averaging for martingales, Studia Scientiarum Mathematicarum Hungarica, vol.37, issue.1-2, pp.21-52, 2001.
DOI : 10.1556/SScMath.37.2001.1-2.2

F. Chaâbane and F. Maâouia, Théorèmes limites avec poids pour les martingales vectorielles, ESAIM: Probability and Statistics, vol.4, pp.137-189, 2000.
DOI : 10.1051/ps:2000103

F. Chaâbane, F. Maâouia, and A. Touati, Généralisation du théorème de la limite centrale presque-sûr pour les martingales vectorielles, Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, vol.326, issue.2, pp.229-232, 1998.
DOI : 10.1016/S0764-4442(97)89476-1

G. A. Churchill, Stochastic models for heterogeneous DNA sequences, Bull. Math. Biol., vol.51, issue.1, pp.79-94, 1989.

P. J. Deschavanne, A. Giron, J. Vilain, G. Fagot, and B. Fertil, Genomic signature: characterization and classification of species assessed by chaos game representation of sequences, Molecular Biology and Evolution, vol.16, issue.10, pp.1391-1399, 1999.
DOI : 10.1093/oxfordjournals.molbev.a026048

L. Devroye and R. Neininger, Random suffix search trees, Random Structures Algorithms, pp.357-396, 2003.

M. Drmota, The variance of the height of digital search trees, Acta Informatica, vol.38, issue.4, pp.261-276, 2002.
DOI : 10.1007/s236-002-8034-5

M. Duflo, Random Iterative Methods, 1997.

M. Duflo, R. Senoussi, and A. Touati, Propriétés asymptotiques presque sûres de l'estimateur des moindres carrés d'un modèle autoregressif vectoriel, pp.1-25, 1991.

M. El Karoui, V. Biaudet, S. Schbath, and A. Gruss, Characteristics of chi distribution on several bacterial genomes, Research in Microbiology, vol.150, pp.579-587, 1999.

J. Ellson, E. Gansner, E. Koren, J. Koutsofios, S. Mocenigo et al., Graphviz - graph visualization software, http://www.graphviz.org/, 2005.

P. Erdős and P. Révész, On the length of the longest head-run, Topics in Information Theory (Second Colloq.), Colloq. Math. Soc. János Bolyai, vol.16, pp.219-228, 1975.

K. J. Falconer, Fractal Geometry: Mathematical Foundations and Applications, J. Wiley and Sons, 1990.
DOI : 10.1002/0470013850

J. Fayolle, Compression de données sans perte et combinatoire analytique, 2006.

B. Fertil, M. Massin, S. Lespinats, C. Devic, P. Dumee et al., GENSTYLE: exploration and analysis of DNA sequences with genomic signature, Nucleic Acids Research, vol.33, issue.Web Server, pp.512-515, 2005.
DOI : 10.1093/nar/gki489

J. C. Fu, Bounds for reliability of large consecutive-k-out-of-n:F system, IEEE Trans. Reliability, issue.35, pp.316-319, 1986.

J. C. Fu and M. V. Koutras, Distribution Theory of Runs: A Markov Chain Approach, Journal of the American Statistical Association, vol.11, issue.427, pp.1050-1058, 1994.
DOI : 10.1214/aoms/1177731421

H. Gerber and S. Li, The occurrence of sequence patterns in repeated experiments and hitting times in a Markov chain, Stochastic Processes and their Applications, pp.101-108, 1981.

N. Goldman, Nucleotide, dinucleotide and trinucleotide frequencies explain patterns observed in chaos game representations of DNA sequences, Nucleic Acids Res., vol.21, issue.10, pp.2487-2491, 1993.

G. C. Goodwin and K. S. Sin, Adaptive Filtering Prediction and Control, Prentice Hall, N.J., 1984.

L. Gordon, M. F. Schilling, and M. S. Waterman, An extreme value theory for long head runs, Probability Theory and Related Fields, pp.279-287, 1986.

I. S. Gradshteyn and I. Ryzhik, Table of Integrals, Series, and Products, 1980.

L. Guo, Self-convergence of weighted least-squares with applications to stochastic adaptive control, IEEE Trans. Automatic Control, vol.41, issue.1, pp.79-89, 1996.

J. M. Gutiérrez, M. A. Rodríguez, and G. Abramson, Multifractal analysis of DNA sequences using a novel chaos-game representation, Physica A: Statistical Mechanics and its Applications, vol.300, issue.1-2, pp.271-284, 2001.
DOI : 10.1016/S0378-4371(01)00333-8

P. Hall and C. C. Heyde, Martingale Limit Theory and its Application, Academic Press, 1980.

H. J. Jeffrey, Chaos Game Representation of gene structure, Nucleic Acids Res., vol.18, pp.2163-2170, 1990.

R. W. Jernigan and R. Baran, Pervasive properties of the genomic signature, BMC Genomics, vol.3, issue.23, 2002.

J. Josse, A. D. Kaiser, and A. Kornberg, Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid, J. Biol. Chem., vol.263, pp.864-875, 1961.

S. Karlin and C. Burge, Dinucleotide relative abundance extremes: a genomic signature, Trends Genet., vol.7, pp.283-290, 1995.

S. Karlin, I. Ladunga, and B. E. Blaisdell, Heterogeneity of genomes: measures and values, Proceedings of the National Academy of Sciences, vol.91, issue.26, pp.12837-12841, 1994.
DOI : 10.1073/pnas.91.26.12837

S. Karlin and J. Mràzek, Compositional differences within and between eukaryotic genomes, Proc. Natl. Acad. Sci. USA, pp.10227-10232, 1997.

S. Karlin and J. Mràzek, Strand compositional asymmetry in bacterial and large viral genomes, Proc. Natl. Acad. Sci. USA, pp.3720-3725, 1998.

S. Karlin, J. Mràzek, and A. M. Campbell, Compositional biases of bacterial genomes and evolutionary implications, Journal of Bacteriology, vol.179, issue.12, pp.3899-3913, 1997.
DOI : 10.1128/jb.179.12.3899-3913.1997

D. E. Knuth, The Art of Computer Programming, Series in Computer Science and Information Processing, 1981.

M. V. Koutras, Waiting Times and Number of Appearances of Events in a Sequence of Discrete Random Variables, Advances in Combinatorial Methods and Applications to Probability and Statistics, pp.363-384, 1997.
DOI : 10.1007/978-1-4612-4140-9_21

M. Lacey, Laws of the iterated logarithm for partial sum processes indexed by functions, Journal of Theoretical Probability, vol.1, issue.3, pp.377-398, 1989.
DOI : 10.1007/BF01054022

T. L. Lai and C. Z. Wei, Least-squares estimates in stochastic regression models with applications to identification and control of dynamic systems, The Annals of Statistics, vol.10, issue.1, pp.154-166, 1982.

T. L. Lai and C. Z. Wei, Asymptotic properties of general autoregressive models and strong consistency of least-squares estimates of their parameters, Journal of Multivariate Analysis, vol.13, issue.1, pp.1-23, 1983.
DOI : 10.1016/0047-259X(83)90002-7

P. L'Ecuyer, Uniform random number generation, Ann. Oper. Res., vol.53, pp.77-120, 1994.

X. Leroy, D. Doligez, J. Garrigue, D. Rémy, and J. Vouillon, The Objective Caml system, documentation and user's manual, http://caml.inria.fr/, 2005.

S. R. Li, A Martingale Approach to the Study of Occurrence of Sequence Patterns in Repeated Experiments, The Annals of Probability, vol.8, issue.6, pp.1171-1176, 1980.
DOI : 10.1214/aop/1176994578

M. A. Lifshits, Lecture notes on almost sure limit theorems, Publications IRMA, vol.54, pp.1-25, 2001.

M. A. Lifshits, Almost sure limit theorem for martingales, Limit Theorems in Probability and Statistics, pp.367-390, 1999.

M. Loève, Probability theory. II, Graduate Texts in Mathematics, vol.46, 1978.
DOI : 10.1007/978-1-4612-6257-2

H. Mahmoud, Evolution of Random Search Trees, chapter 6, 1992.

H. M. Martinez, An efficient method for finding repeats in molecular sequences, pp.4629-4634, 1983.

A. J. Menezes, P. C. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, Series on Discrete Mathematics and its Applications, 1997.
DOI : 10.1201/9781439821916

S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability, 1993.

F. Muri, Modelling bacterial genomes using hidden Markov models, Compstat'98 Proceedings in Computational Statistics, pp.89-100, 1998.

W. Penney, Problem: Penney-ante, J. Recreational Math., vol.2, p.241, 1969.

G. Perrière and M. Gouy, WWW-query: An on-line retrieval system for biological sequence banks, Biochimie, vol.78, issue.5, pp.364-369, 1996.
DOI : 10.1016/0300-9084(96)84768-7

V. Petrov, On the Probabilities of Large Deviations for Sums of Independent Random Variables, Theory of Probability & Its Applications, vol.10, issue.2, pp.287-298, 1965.
DOI : 10.1137/1110033

V. Pozdnyakov, J. Glaz, M. Kulldorff, and J. M. Steele, A martingale approach to scan statistics, Annals of the Institute of Statistical Mathematics, vol.100, issue.1, pp.21-37, 2005.
DOI : 10.1007/BF02506876

M. Régnier, A unified approach to word occurrence probabilities, Discrete Applied Mathematics, vol.104, pp.259-280, 2000.

G. Reinert, S. Schbath, and M. S. Waterman, Probabilistic and Statistical Properties of Words: An Overview, Journal of Computational Biology, vol.7, issue.1-2, pp.1-46, 2000.
DOI : 10.1089/10665270050081360

S. Robin and J. J. Daudin, Exact distribution of word occurrences in a random sequence of letters, J. Appl. Prob., vol.36, pp.179-193, 1999.

E. Rocha, A. Viari, and A. Danchin, Oligonucleotide bias in Bacillus subtilis: General trends and taxonomic comparisons, Nucleic Acids Research, vol.26, issue.12, pp.2971-2980, 1998.
DOI : 10.1093/nar/26.12.2971

A. Roy, C. Raychaudhury, and A. Nandy, Novel techniques of graphical representation and analysis of DNA sequences - A review, Journal of Biosciences, vol.19, issue.1, pp.55-71, 1998.
DOI : 10.1007/BF02728525

G. J. Russel and J. H. Subak-Sharpe, Similarity of the general designs of protochordates and invertebrates, Nature, vol.11, issue.5602, pp.533-535, 1977.
DOI : 10.1016/0005-2787(66)90373-X

N. Saitou and M. Nei, The neighbor-joining method: A new method for reconstructing phylogenetic trees, Mol. Biol. Evol., vol.4, issue.4, pp.406-425, 1987.

S. S. Samarova, On the length of the longest head-run for a Markov chain with two states, Theory of Probability and its Applications, pp.498-509, 1981.

P. Schatte, On Strong Versions of the Central Limit Theorem, Mathematische Nachrichten, vol.77, issue.1, pp.249-256, 1988.
DOI : 10.1002/mana.19881370117

P. Schatte, On almost sure convergence of subsequences in the central limit theorem, Statistics, vol.24, issue.4, pp.237-246, 1990.
DOI : 10.1007/BF00334035

D. Stark, First Occurrence in Pairs of Long Words: A Penney-ante Conjecture of Pevzner, Combinatorics, Probability and Computing, vol.2, issue.03, pp.279-285, 1995.
DOI : 10.1016/0097-3165(81)90005-4

V. Stefanov and A. G. Pakes, Explicit distributional results in pattern formation, The Annals of Applied Probability, vol.7, issue.3, pp.666-678, 1997.
DOI : 10.1214/aoap/1034801248

A. W. van der Vaart, Asymptotic Statistics, 1998.

C. Z. Wei and J. Winnicki, Estimation of the Means in the Branching Process with Immigration, The Annals of Statistics, vol.18, issue.4, pp.1757-1773, 1990.
DOI : 10.1214/aos/1176347876

C. Z. Wei, Asymptotic properties of least-squares estimates in stochastic regression models, The Annals of Statistics, pp.1498-1508, 1985.

C. Z. Wei, Adaptive prediction by least squares predictors in stochastic regression models with applications to time series, The Annals of Statistics, pp.1667-1682, 1987.

P. Weiner, Linear pattern matching algorithms, 14th Annual Symposium on Switching and Automata Theory (SWAT 1973), pp.1-11, 1973.
DOI : 10.1109/SWAT.1973.13

D. Williams, Probability with Martingales, Cambridge Mathematical Textbooks, 1991.

J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Transactions on Information Theory, vol.23, issue.3, pp.337-343, 1977.
DOI : 10.1109/TIT.1977.1055714