J. Adachi and M. Hasegawa, MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood, Computer Science Monographs, 1996.

A. Agresti, Categorical data analysis, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics, 1990.

A. Agresti, An introduction to categorical data analysis, Wiley Series in Probability and Statistics, 2007.

A. Albert, Estimating the Infinitesimal Generator of a Continuous Time, Finite State Markov Process, The Annals of Mathematical Statistics, vol.33, issue.2, pp.727-753, 1962.
DOI : 10.1214/aoms/1177704594

F. Antequera and A. Bird, CpG islands as genomic footprints of promoters that are associated with replication origins, Current Biology, vol.9, issue.17, pp.661-667, 1999.
DOI : 10.1016/S0960-9822(99)80418-7

P. F. Arndt, C. B. Burge, and T. Hwa, DNA Sequence Evolution with Neighbor-Dependent Mutation, Journal of Computational Biology, vol.10, issue.3-4, pp.313-322, 2003.
DOI : 10.1089/10665270360688039
URL : http://arxiv.org/abs/physics/0112029

P. F. Arndt and T. Hwa, Identification and measurement of neighbor-dependent nucleotide substitution processes, Bioinformatics, vol.21, issue.10, pp.2322-2328, 2005.
DOI : 10.1093/bioinformatics/bti376

L. E. Baum and T. Petrie, Statistical Inference for Probabilistic Functions of Finite State Markov Chains, The Annals of Mathematical Statistics, vol.37, issue.6, pp.1554-1563, 1966.
DOI : 10.1214/aoms/1177699147

L. E. Baum, T. Petrie, G. Soules, and N. Weiss, A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, The Annals of Mathematical Statistics, vol.41, issue.1, pp.164-171, 1970.
DOI : 10.1214/aoms/1177697196

G. Bernardi, B. Olofsson, J. Filipski, M. Zerial, J. Salinas et al., The mosaic genome of warm-blooded vertebrates, Science, vol.228, issue.4702, pp.953-958, 1985.
DOI : 10.1126/science.4001930

J. Besag, Statistical analysis of non-lattice data, The Statistician, pp.179-195, 1975.

M. W. Birch, A New Proof of the Pearson-Fisher Theorem, The Annals of Mathematical Statistics, vol.35, issue.2, pp.817-824, 1964.
DOI : 10.1214/aoms/1177703581

J. Bérard, J. Gouéré, and D. Piau, Solvable models of neighbor-dependent substitution processes, Mathematical Biosciences, vol.211, issue.1, pp.56-88, 2008.
DOI : 10.1016/j.mbs.2007.10.001

M. Bulmer, Neighboring base effects on substitution rates in pseudogenes, Mol. Biol. Evol, vol.3, issue.4, pp.322-329, 1986.

D. Charif and J. R. Lobry, SeqinR 1.0-2: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis, in Structural approaches to sequence evolution: Molecules, networks, populations, Biological and Medical Physics, Biomedical Engineering, pp.207-232, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00434576

F. Chen and W. Li, Genomic Divergences between Humans and Other Hominoids and the Effective Population Size of the Common Ancestor of Humans and Chimpanzees, The American Journal of Human Genetics, vol.68, issue.2, pp.444-456, 2001.
DOI : 10.1086/318206

O. F. Christensen, Pseudo-likelihood for Non-reversible Nucleotide Substitution Models with Neighbour Dependent Rates, Statistical Applications in Genetics and Molecular Biology, vol.5, issue.1, 2006.
DOI : 10.2202/1544-6115.1217

O. F. Christensen, A. Hobolth, and J. L. Jensen, Pseudo-Likelihood Analysis of Codon Substitution Models with Neighbor-Dependent Rates, Journal of Computational Biology, vol.12, issue.9, pp.1166-1182, 2005.
DOI : 10.1089/cmb.2005.12.1166

R. Christensen, Log-linear models and logistic regression, Springer Texts in Statistics, 1997.

G. A. Churchill, Stochastic models for heterogeneous DNA sequences, Bulletin of Mathematical Biology, vol.45, issue.1, pp.79-94, 1989.
DOI : 10.1007/BF02458837

W. G. Cochran, The $\chi^2$ Test of Goodness of Fit, The Annals of Mathematical Statistics, vol.23, issue.3, pp.315-345, 1952.
DOI : 10.1214/aoms/1177729380

W. J. Conover, Practical nonparametric statistics, 1999.

H. Cramér, Mathematical Methods of Statistics, 1946.

N. Cressie and T. R. Read, Multinomial goodness-of-fit tests, J. Roy. Statist. Soc. Ser. B, vol.46, issue.3, pp.440-464, 1984.

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Statist. Soc. Ser. B, vol.39, issue.1, pp.1-38, 1977.

J. L. Doob, Stochastic processes, 1953.

L. Duret and N. Galtier, The Covariation Between TpA Deficiency, CpG Deficiency, and G+C Content of Human Isochores Is Due to a Mathematical Artifact, Molecular Biology and Evolution, vol.17, issue.11, pp.1620-1625, 2000.
DOI : 10.1093/oxfordjournals.molbev.a026261
URL : https://hal.archives-ouvertes.fr/hal-00427077

J. Felsenstein, Evolutionary trees from DNA sequences: A maximum likelihood approach, Journal of Molecular Evolution, vol.24, issue.6, pp.368-376, 1981.
DOI : 10.1007/BF01734359

R. A. Fisher, On the Interpretation of $\chi^2$ from Contingency Tables, and the Calculation of P, Journal of the Royal Statistical Society, vol.85, issue.1, pp.87-94, 1922.
DOI : 10.2307/2340521

W. Fitch and E. Markowitz, An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution, Biochemical Genetics, vol.21, issue.5, pp.579-593, 1970.
DOI : 10.1007/BF00486096

N. Galtier, Maximum-Likelihood Phylogenetic Analysis Under a Covarion-like Model, Molecular Biology and Evolution, vol.18, issue.5, pp.866-873, 2001.
DOI : 10.1093/oxfordjournals.molbev.a003868

G. Gibson and S. V. Muse, Précis de génomique, 2004.

E. J. Gilbert, On the Identifiability Problem for Functions of Finite Markov Chains, The Annals of Mathematical Statistics, vol.30, issue.3, pp.688-697, 1959.
DOI : 10.1214/aoms/1177706199

N. Goldman, Statistical tests of models of DNA substitution, Journal of Molecular Evolution, vol.46, issue.2, pp.182-198, 1993.
DOI : 10.1007/BF00166252

N. Goldman and Z. Yang, A codon-based model of nucleotide substitution for protein-coding DNA sequences, Mol. Biol. Evol, vol.11, issue.5, pp.725-736, 1994.

I. J. Good, Probability and the weighing of evidence, 1950.

R. Grantham, Amino Acid Difference Formula to Help Explain Protein Evolution, Science, vol.185, issue.4154, pp.862-864, 1974.
DOI : 10.1126/science.185.4154.862

M. Guedj, Association of TNFAIP3 rs5029939 variant with systemic sclerosis in European Caucasian population. Under review

S. Guindon and O. Gascuel, A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood, Systematic Biology, vol.52, issue.5, pp.696-704, 2003.
DOI : 10.1080/10635150390235520

M. Hasegawa, H. Kishino, and T. Yano, Dating of the human-ape splitting by a molecular clock of mitochondrial DNA, Journal of Molecular Evolution, vol.275, issue.3, pp.160-174, 1985.
DOI : 10.1007/BF02101694

S. T. Hess, J. D. Blake, and R. D. Blake, Wide variations in neighbor-dependent substitution rates, Journal of Molecular Biology, vol.236, issue.4, pp.1022-1033, 1994.
DOI : 10.1016/0022-2836(94)90009-4

J. L. Jensen and A. K. Pedersen, Probabilistic models of DNA sequence evolution with context dependent rates of substitution, Advances in Applied Probability, vol.11, issue.02, pp.499-517, 2000.
DOI : 10.1089/cmb.1998.5.149

T. H. Jukes and C. R. Cantor, Evolution of Protein Molecules, 1969.
DOI : 10.1016/B978-1-4832-3211-9.50009-7

F. P. Kelly, Reversibility and stochastic networks, 1979.

S. Kim, H. Choi, and S. Lee, Estimate-based goodness-of-fit test for large sparse multinomial distributions, Computational Statistics & Data Analysis, vol.53, issue.4, pp.1122-1131, 2009.
DOI : 10.1016/j.csda.2008.10.011

M. Kimura, A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences, Journal of Molecular Evolution, vol.206, issue.5, pp.111-120, 1980.
DOI : 10.1007/BF01731581

K. J. Koehler and K. Larntz, An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials, Journal of the American Statistical Association, vol.36, issue.12, pp.336-344, 1980.
DOI : 10.1080/01621459.1971.10482235

H. H. Ku, A Note on Contingency Tables Involving Zero Frequencies and the 2I Test, Technometrics, vol.5, issue.3, pp.398-400, 1963.
DOI : 10.2307/1266344

S. Kullback, Information theory and statistics, 1959.

S. Kumar and S. B. Hedges, A molecular timescale for molecular evolution, Nature, vol.392, pp.917-920, 1998.

F. Larsen, G. Gundersen, R. Lopez, and H. Prydz, CpG islands as gene markers in the human genome, Genomics, vol.13, issue.4, pp.1095-1107, 1992.
DOI : 10.1016/0888-7543(92)90024-M

S. L. Lauritzen, Graphical models, volume 17 of Oxford Statistical Science Series, 1996.

G. Lunter and J. Hein, A nucleotide substitution model with nearest-neighbour interactions, Bioinformatics, vol.20, issue.Suppl 1, pp.216-223, 2004.
DOI : 10.1093/bioinformatics/bth901

I. Miklos, G. A. Lunter, and I. Holmes, A "Long Indel" Model For Evolutionary Sequence Alignment, Molecular Biology and Evolution, vol.21, issue.3, pp.529-540, 2004.
DOI : 10.1093/molbev/msh043

B. R. Morton, The Influence of Neighboring Base Composition on Substitutions in Plant Chloroplast Coding Sequences, Molecular Biology and Evolution, vol.14, issue.2, pp.189-194, 1997.
DOI : 10.1093/oxfordjournals.molbev.a025752

S. V. Muse and B. S. Gaut, A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome, Mol. Biol. Evol, vol.11, issue.5, pp.715-724, 1994.

J. Neyman and E. S. Pearson, Further Notes on the $\chi^2$ Distribution, Biometrika, vol.22, issue.3/4, pp.298-305, 1931.
DOI : 10.2307/2332097

P. Nicolas, L. Bize, F. Muri, M. Hoebeke, F. Rodolphe et al., Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models, Nucleic Acids Research, vol.30, issue.6, pp.1418-1426, 2002.
DOI : 10.1093/nar/30.6.1418
URL : http://doi.org/10.1093/nar/30.6.1418

H. Ochman and A. C. Wilson, Evolution in bacteria: Evidence for a universal substitution rate in cellular genomes, Journal of Molecular Evolution, vol.80, issue.3, pp.74-86, 1987.
DOI : 10.1007/BF02111283

A. K. Pedersen, J. L. Jensen-pedersen, C. Wiuf, and F. B. Christiansen, A Dependent-Rates Model and an MCMC-Based Methodology for the Maximum-Likelihood Analysis of Sequences with Overlapping Reading Frames, Molecular Biology and Evolution, vol.18, issue.5, pp.763-776, 1998.
DOI : 10.1093/oxfordjournals.molbev.a003859

T. Petrie, Probabilistic Functions of Finite State Markov Chains, The Annals of Mathematical Statistics, vol.40, issue.1, pp.97-115, 1969.
DOI : 10.1214/aoms/1177697807

R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2009.

L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

T. R. Read and N. A. Cressie, Goodness-of-fit statistics for discrete multivariate data, 1988.
DOI : 10.1007/978-1-4612-4578-0

L. Sachs, Applied statistics, Springer Series in Statistics, 1982.

A. C. Siepel and D. Haussler, Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis, Journal of Computational Biology, vol.11, issue.2-3, pp.413-428, 2004.
DOI : 10.1089/1066527041410472

S. Tavaré, Some probabilistic and statistical problems in the analysis of DNA sequences, Lectures on Mathematics in the Life Sciences, vol.17, 1986.

J. L. Thorne, H. Kishino, and J. Felsenstein, An evolutionary model for maximum likelihood alignment of DNA sequences, Journal of Molecular Evolution, vol.80, issue.2, pp.114-124, 1991.
DOI : 10.1007/BF02193625

J. L. Thorne, H. Kishino, and J. Felsenstein, Inching toward reality: An improved likelihood model of sequence evolution, Journal of Molecular Evolution, vol.167, issue.1, pp.3-16, 1992.
DOI : 10.1007/BF00163848

M. Trémolières, I. Combroux, A. Hermann, and P. Nobelis, Conservation status assessment of aquatic habitats within the Rhine floodplain using an index based on macrophytes, Annales de Limnologie - International Journal of Limnology, vol.43, issue.4, pp.233-244, 2007.
DOI : 10.1051/limn:2007002

C. Tuffley and M. Steel, Modeling the covarion hypothesis of nucleotide substitution, Mathematical Biosciences, vol.147, issue.1, pp.63-91, 1998.
DOI : 10.1016/S0025-5564(97)00081-3

J. C. Venter, The Sequence of the Human Genome, Science, vol.291, issue.5507, pp.1304-1351, 2001.
DOI : 10.1126/science.1058040
URL : https://hal.archives-ouvertes.fr/hal-00465088

S. Whelan and N. Goldman, Distributions of Statistics Used for the Comparison of Models of Sequence Evolution in Phylogenetics, Molecular Biology and Evolution, vol.16, issue.9, pp.1292-1299, 1999.
DOI : 10.1093/oxfordjournals.molbev.a026219

S. Whelan and N. Goldman, Estimating the Frequency of Events That Cause Multiple-Nucleotide Changes, Genetics, vol.167, issue.4, pp.2027-2043, 2004.
DOI : 10.1534/genetics.103.023226

C. J. Wu, On the Convergence Properties of the EM Algorithm, The Annals of Statistics, vol.11, issue.1, pp.95-103, 1983.
DOI : 10.1214/aos/1176346060

Z. Yang, Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites, Mol Biol Evol, vol.10, issue.6, pp.1396-1401, 1993.

Z. Yang, Estimating the pattern of nucleotide substitution, Journal of Molecular Evolution, vol.39, issue.1, pp.39-105, 1994.
DOI : 10.1007/BF00178256

Z. Yang, Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods, Journal of Molecular Evolution, vol.11, issue.3, pp.39-306, 1994.
DOI : 10.1007/BF00160154

Z. Yang, A space-time process model for the evolution of DNA sequences, Genetics, vol.139, issue.2, pp.993-1005, 1995.

Z. Yang, R. Nielsen, N. Goldman, and A. K. Pedersen, Codon-substitution models for heterogeneous selection pressure at amino acid sites, Genetics, vol.155, issue.1, pp.431-449, 2000.

Z. Yang and A. D. Yoder, Estimation of the Transition/Transversion Rate Bias and Species Sampling, Journal of Molecular Evolution, vol.48, issue.3, pp.274-283, 1999.
DOI : 10.1007/PL00006470

J. K. Yarnold, The Minimum Expectation in $X^2$ Goodness of Fit Tests and the Accuracy of Approximations for the Null Distribution, Journal of the American Statistical Association, vol.65, issue.330, pp.864-886, 1970.
DOI : 10.2307/2284594