D. Achlioptas and F. McSherry, On Spectral Learning of Mixtures of Distributions, Learning Theory, pp.458-469, 2005.
DOI : 10.1007/11503415_31

M. Adams and A. B. Nobel, On density estimation from ergodic processes, The Annals of Probability, pp.794-804, 1998.

M. Adams and A. B. Nobel, Uniform approximation of Vapnik–Chervonenkis classes, Bernoulli, vol.18, issue.4, pp.1310-1319, 2012.
DOI : 10.3150/11-BEJ379

L. Akoglu and C. Faloutsos, Event detection in time series of mobile communication graphs, Proceedings of the Army Science Conference, pp.1-8, 2010.

P. Algoet, Universal schemes for prediction, gambling and portfolio selection, The Annals of Probability, pp.901-941, 1992.

P. Algoet, Universal schemes for learning the best nonlinear predictor given the infinite past and side information, IEEE Transactions on Information Theory, vol.45, issue.4, pp.1165-1185, 1999.

M. F. Balcan and P. Gupta, Robust hierarchical clustering, The 23rd Annual Conference on Learning Theory (COLT), p.36, 2010.

M. Basseville and I. V. Nikiforov, Detection of abrupt changes: theory and application
URL : https://hal.archives-ouvertes.fr/hal-00008518

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2000.
DOI : 10.1109/34.865189

P. Billingsley, Statistical Methods in Markov Chains, The Annals of Mathematical Statistics, vol.32, issue.1, pp.12-40, 1961.
DOI : 10.1214/aoms/1177705136

P. Billingsley, Ergodic theory and information, p.42, 1965.

R. Bolton and D. Hand, Statistical fraud detection: A review, Statistical Science, vol.17, p.21, 2002.

N. Bouguila and D. Ziou, Online clustering via finite mixtures of Dirichlet and minimum message length, Engineering Applications of Artificial Intelligence, vol.19, issue.4, pp.371-379, 2006.
DOI : 10.1016/j.engappai.2006.01.012

B. Brodsky and B. Darkhovsky, Non-parametric methods in change-point problems, Mathematics and its applications, p.32, 1993.

B. Brodsky and B. Darkhovsky, Non-parametric statistical diagnosis: problems and methods, p.32, 2000.
DOI : 10.1007/978-94-015-9530-8

B. Brodsky and B. Darkhovsky, Sequential change-point detection for mixing random sequences under composite hypotheses, Statistical Inference for Stochastic Processes, pp.35-54, 2008.
DOI : 10.1007/s11203-006-9004-6

I. Cadez, S. Gaffney, and P. Smyth, A general probabilistic framework for clustering individuals and objects, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '00, pp.140-149, 2000.
DOI : 10.1145/347090.347119

E. Carlstein, Non-parametric Change-Point Estimation, The Annals of Statistics, pp.188-197, 1988.

E. Carlstein and S. Lele, Non-parametric change-point estimation for data from an ergodic sequence, Teoriya Veroyatnostei i ee Primeneniya, pp.910-917, 1993.

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, pp.31-104, 2006.
DOI : 10.1017/CBO9780511546921

J. Chen, Parametric statistical change point analysis, p.32, 2012.

I. Csiszar and P. C. Shields, Notes on information theory and statistics, Foundations and Trends in Communications and Information Theory, pp.30-42, 2004.

M. Csörgö and L. Horváth, Limit Theorems in Change-Point Analysis (Wiley Series in Probability & Statistics), p.32, 1998.

S. Dasgupta, Learning mixtures of Gaussians, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039), pp.634-644, 1999.
DOI : 10.1109/SFFCS.1999.814639

L. Dümbgen, The asymptotic behavior of some non-parametric change-point estimators, The Annals of Statistics, pp.1471-1495, 1991.

E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky, A sticky HDP-HMM with application to speaker diarization, The Annals of Applied Statistics, vol.5, issue.2A, pp.1020-1056, 2011.
DOI : 10.1214/10-AOAS395SUPP

L. Giraitis, R. Leipus, and D. Surgailis, The change-point problem for dependent observations, Journal of Statistical Planning and Inference, vol.53, issue.3, pp.1-15, 1995.
DOI : 10.1016/0378-3758(95)00148-4

R. Gray, Probability, Random Processes, and Ergodic Properties, pp.27-43, 1988.

S. Hariz, J. Wylie, and Q. Zhang, Optimal rate of convergence for nonparametric change-point estimators for nonstationary sequences, The Annals of Statistics, vol.35, issue.4, pp.1802-1826, 2007.
DOI : 10.1214/009053606000001596

A. Jain, Data clustering: 50 years beyond K-means, Pattern Recognition Letters, vol.31, issue.8, pp.31-651, 2010.
DOI : 10.1016/j.patrec.2009.09.011

T. Jebara, Y. Song, and K. Thadani, Spectral Clustering and Embedding with Hidden Markov Models, European Conference on Machine Learning (ECML) 2007, pp.164-175, 2007.
DOI : 10.1007/978-3-540-74958-5_18

A. Khaleghi and D. Ryabko, Locating changes in highly-dependent data with unknown number of change points, Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, United States, p.49, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00765436

A. Khaleghi and D. Ryabko, Non-parametric multiple change point estimation in highly dependent time series, Proceedings of the 24th International Conference on Algorithmic Learning Theory (ALT'13), p.49

A. Khaleghi and D. Ryabko, Asymptotically consistent estimation of the number of change points in highly dependent time series, Proceedings of the 31st International Conference on Machine Learning, p.49, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01026583

A. Khaleghi, D. Ryabko, J. Mary, and P. Preux, Online clustering of processes, International Conference on Artificial Intelligence & Statistics (AI & Stats), pp.601-609
URL : https://hal.archives-ouvertes.fr/hal-00765462

J. Kleinberg, An impossibility theorem for clustering, 15th Conference on Neural Information Processing Systems (NIPS'02), pp.446-453, 2002.

P. Kokoszka and R. Leipus, Detection and estimation of changes in regime, Long-Range Dependence: Theory and Applications, pp.325-337, 2002.

M. Kumar, N. R. Patel, and J. Woo, Clustering seasonality patterns in the presence of errors, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '02, pp.557-563, 2002.
DOI : 10.1145/775047.775129

M. Lavielle, Detection of multiple changes in a sequence of dependent variables, Stochastic Processes and their Applications, pp.79-102, 1999.

M. Lavielle, Using penalized contrasts for the change-point problem, Signal Processing, vol.85, issue.8, pp.1501-1510, 2005.
DOI : 10.1016/j.sigpro.2005.01.012
URL : https://hal.archives-ouvertes.fr/inria-00070662

M. Lavielle and G. Teyssiere, Adaptive Detection of Multiple Change-Points in Asset Price Volatility, Long memory in economics, pp.129-156, 2007.
DOI : 10.1007/978-3-540-34625-8_5

E. Lebarbier, Detecting multiple change-points in the mean of Gaussian process by model selection, Signal Processing, vol.85, issue.4, pp.717-736, 2005.
DOI : 10.1016/j.sigpro.2004.11.012
URL : https://hal.archives-ouvertes.fr/inria-00071847

C. Lévy-Leduc and F. Roueff, Detection and localization of change-points in high-dimensional network traffic data, The Annals of Applied Statistics, pp.637-662, 2009.

C. Li and G. Biswas, Applying the hidden Markov model methodology for unsupervised learning of temporal data, International Journal of Knowledge Based Intelligent Engineering Systems, vol.6, issue.3, pp.152-160, 2002.

L. Li and B. A. Prakash, Time series clustering: Complex is simpler, the 28th International Conference on Machine Learning (ICML'11), pp.118-119, 2011.

A. Lung-Yut-Fong, C. Lévy-Leduc, and O. Cappé, Robust retrospective multiple change-point estimation for multivariate data, 2011 IEEE Statistical Signal Processing Workshop (SSP), pp.405-408, 2011.
DOI : 10.1109/SSP.2011.5967716
URL : https://hal.archives-ouvertes.fr/hal-00564410

A. Lung-Yut-Fong, C. Lévy-Leduc, and O. Cappé, Distributed detection/localization of change-points in high-dimensional network traffic data, Statistics and Computing, vol.3, issue.3, pp.485-496, 2012.
DOI : 10.1007/s11222-011-9240-5
URL : https://hal.archives-ouvertes.fr/hal-00420862

M. Balcan, A. Blum, and S. Vempala, A discriminative framework for clustering via similarity functions, Proceedings of the fortieth annual ACM symposium on Theory of computing, STOC 08, pp.23-98, 2008.
DOI : 10.1145/1374376.1374474

M. Mahajan, P. Nimbhorkar, and K. Varadarajan, The planar k-means problem is NP-hard, WALCOM '09: Proceedings of the 3rd International Workshop on Algorithms and Computation, pp.274-285, 2009.

P. Massart, A Non-asymptotic Theory for Model Selection, European Congress of Mathematics, pp.309-323, 2005.
DOI : 10.4171/009-1/20

P. McCullagh and J. Yang, How many clusters?, Bayesian Analysis, pp.101-120, 2008.

G. Morvai, S. Yakowitz, and L. Györfi, Non-parametric inference for ergodic, stationary time series, Annals of Statistics, vol.24, issue.1, pp.370-379, 1996.

G. Morvai, S. Yakowitz, and P. Algoet, Weakly convergent non-parametric forecasting of stationary time series, IEEE Transactions on Information Theory, vol.43, issue.2, pp.483-498, 1997.

H. Müller and D. Siegmund, Change-point problems, IMS, p.32, 1994.

D. Ornstein, Ergodic Theory, Randomness, and Dynamical Systems, p.46, 1974.

A. Panuccio, M. Bicego, and V. Murino, A hidden Markov model-based approach to sequential data clustering, pp.734-742, 2002.

P. Papantoni-Kazakos and A. Burrell, Robust Sequential Algorithms for the Detection of Changes in Data Generating Processes, Journal of Intelligent & Robotic Systems, vol.13, issue.1, pp.3-17, 2010.
DOI : 10.1007/s10846-010-9405-z

F. Picard, S. Robin, M. Lavielle, C. Vaisse, and J. Daudin, A statistical approach for array CGH data analysis, BMC Bioinformatics, vol.6, issue.1, pp.27-48, 2005.
DOI : 10.1186/1471-2105-6-27
URL : https://hal.archives-ouvertes.fr/hal-00427846

B. Ryabko, Prediction of random sequences and universal coding, Problems of Information Transmission, pp.87-96, 1988.

D. Ryabko, Clustering processes, Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp.919-926, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00477238

D. Ryabko, Discrimination Between B-Processes is Impossible, Journal of Theoretical Probability, vol.44, issue.6, pp.565-575, 2010.
DOI : 10.1007/s10959-009-0263-1
URL : https://hal.archives-ouvertes.fr/hal-00639537

D. Ryabko, Sequence prediction in realizable and non-realizable cases, Proceedings of the 23rd Conference on Learning Theory (COLT 2010), pp.119-131, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00440669

D. Ryabko, On the relation between realizable and non-realizable cases of the sequence prediction problem, Journal of Machine Learning Research (JMLR), vol.12, pp.2161-2180, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00639474

D. Ryabko, Testing composite hypotheses about discrete ergodic processes, TEST, vol.56, issue.3, pp.317-329, 2012.
DOI : 10.1007/s11749-011-0245-3
URL : https://hal.archives-ouvertes.fr/hal-00639477

D. Ryabko and J. Mary, Reducing statistical time-series problems to binary classification, Neural Information Processing Systems (NIPS), pp.2069-2077, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00675637

D. Ryabko and B. Ryabko, Nonparametric Statistical Inference for Ergodic Processes, IEEE Transactions on Information Theory, vol.56, issue.3, pp.1430-1435, 2010.
DOI : 10.1109/TIT.2009.2039169
URL : https://hal.archives-ouvertes.fr/inria-00269249

S. Zhong and J. Ghosh, A unified framework for model-based clustering, Journal of Machine Learning Research, vol.4, pp.1001-1037, 2003.

P. Shields, The Ergodic Theory of Discrete Sample Paths, AMS Bookstore, vol.13, issue.46, pp.20-112, 1996.
DOI : 10.1090/gsm/013

P. Smyth, Clustering sequences with hidden Markov models, Advances in Neural Information Processing Systems, pp.648-654, 1997.

R. Solomonoff, Complexity-based induction systems: Comparisons and convergence theorems, IEEE Transactions on Information Theory, vol.24, issue.4, pp.422-432, 1978.
DOI : 10.1109/TIT.1978.1055913

G. Stephen, String searching algorithms, World Scientific publishing company, vol.3, p.28, 1994.
DOI : 10.1142/2418

M. Sudan, List decoding, Theoretical Computer Science: Exploring New Frontiers of Theoretical Informatics, pp.25-41, 2000.
DOI : 10.1145/346048.346049

M. Talih and N. Hengartner, Structural learning with time-varying components: tracking the cross-section of financial time series, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.12, issue.3, pp.321-341, 2005.
DOI : 10.1214/aos/1069362315

A. Tartakovsky, B. Rozovskii, R. Blazek, and H. Kim, A novel approach to detection of intrusions in computer networks via adaptive sequential and batch-sequential change-point detection methods, IEEE Transactions on Signal Processing, vol.54, issue.9, pp.3372-3382, 2006.
DOI : 10.1109/TSP.2006.879308

E. Ukkonen, On-line construction of suffix trees, Algorithmica, vol.10, issue.3, pp.249-260, 1995.
DOI : 10.1007/BF01206331

J. Vert and K. Bleakley, Fast detection of multiple change-points shared by many signals using group LARS, NIPS, pp.2343-2351, 2010.

L. Vostrikova, Detecting disorder in multidimensional random processes, Soviet Mathematics Doklady, vol.24, pp.55-59, 1981.

Y. Yao, Estimating the number of change-points via Schwarz' criterion, Statistics & Probability Letters, vol.6, issue.3, pp.181-189, 1988.
DOI : 10.1016/0167-7152(88)90118-6