M. Abramowitz and I. A. Stegun, Handbook of mathematical functions with formulas, graphs, and mathematical tables, Applied Mathematics Series, vol.55, p.1046, 1965.

H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, issue.6, pp.716-723, 1974.

C. Ambroise, G. Grasseau, M. Hoebeke, and P. Latouche, The mixer package, 2010.

P. Arabie, S. Schleutermann, J. Daws, and L. Hubert, Marketing Applications of Sequencing and Partitioning of Nonsymmetric and/or Two-Mode Matrices, pp.215-224, 1988.
DOI : 10.1007/978-3-642-73489-2_18

B. B. Baker and E. T. Copson, The mathematical theory of Huygens' principle, 1950.

A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh, Clustering with Bregman Divergences, In Journal of Machine Learning Research, 2004.
DOI : 10.1137/1.9781611972740.22

V. Batagelj and P. Doreian, An optimizational approach to regular equivalence, Social Networks, vol.14, issue.1-2, pp.121-135, 1992.
DOI : 10.1016/0378-8733(92)90016-Z

J. C. Bezdek and R. J. Hathaway, VAT: a tool for visual assessment of (cluster) tendency, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290), pp.2225-2230, 2002.
DOI : 10.1109/IJCNN.2002.1007487

P. J. Bickel and A. Chen, A nonparametric view of network models and Newman-Girvan and other modularities, Proceedings of the National Academy of Sciences, 2009.
DOI : 10.1073/pnas.0907096106

C. Blake and C. J. Merz, UCI repository of machine learning databases, p.159, 1998.

D. M. Blei and M. I. Jordan, Variational inference for Dirichlet process mixtures, Bayesian Analysis, vol.1, issue.1, pp.121-144, 2005.
DOI : 10.1214/06-BA104

V. Blondel, J. Guillaume, R. Lambiotte, and E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment, vol.2008, issue.10, 2008.
DOI : 10.1088/1742-5468/2008/10/P10008

URL : https://hal.archives-ouvertes.fr/hal-01146070

V. D. Blondel, G. Krings, and I. Thomas, Regions and borders of mobile telephony in Belgium and in the Brussels metropolitan zone, 2010.

S. Boriah, V. Chandola, and V. Kumar, Similarity Measures for Categorical Data: A Comparative Evaluation, SDM, pp.243-254, 2008.
DOI : 10.1137/1.9781611972788.22

M. Boullé, Recherche d'une représentation des données efficace pour la fouille de grandes bases de données, PhD thesis, École Nationale des Télécommunications, 2007.

M. Boullé, Bivariate data grid models for supervised learning, 2008.

M. Boullé, Estimation de la densité d'arcs dans les graphes de grande taille : une alternative à la détection de clusters, Extraction et gestion des connaissances (EGC'2011), pp.353-364, 2011.

M. Boullé, Functional data clustering via piecewise constant nonparametric density estimation, Pattern Recognition, vol.45, issue.12, pp.4389-4401, 2012.
DOI : 10.1016/j.patcog.2012.05.016

M. Boullé, Sélection bayésienne de modèles avec prior dépendant des données, Extraction et gestion des connaissances (EGC'2012), pp.29-34, 2012.

M. Boullé, R. Guigourès, and F. Rossi, Nonparametric Hierarchical Clustering of Functional Data, Advances in Knowledge Discovery and Management, 2013.
DOI : 10.1007/978-3-319-02999-3_2

U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer et al., On modularity - NP-completeness and beyond, 2006.

R. L. Breiger, S. A. Boorman, and P. Arabie, An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling, Journal of Mathematical Psychology, vol.12, issue.3, 1975.

F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin, A hidden process regression model for functional data description. Application to curve discrimination, Neurocomputing, vol.73, issue.7-9, p.1210, 2010.

M. Charrad and M. B. Ahmed, Simultaneous Clustering: A Survey, Pattern Recognition and Machine Intelligence, pp.370-375, 2011.

URL : https://hal.archives-ouvertes.fr/hal-01125890

Y. Cheng and G. M. Church, Biclustering of expression data, Proceedings of the eighth international conference on intelligent systems for molecular biology, pp.93-103, 2000.

C. H. Coombs, R. M. Dawes, and A. Tversky, Mathematical psychology: An elementary introduction, 1970.

T. M. Cover and J. A. Thomas, Elements of information theory, 2006.

M. N. Dash, K. Choi, P. Scheuermann, and H. Liu, Feature selection for clustering - a filter solution, 2002 IEEE International Conference on Data Mining, 2002. Proceedings., pp.115-122, 2002.
DOI : 10.1109/ICDM.2002.1183893

T. De Bie, An information theoretic framework for data mining, KDD, pp.564-572, 2011.

I. S. Dhillon, Co-clustering documents and words using bipartite spectral graph partitioning, Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '01, pp.269-274, 2001.
DOI : 10.1145/502512.502550

I. S. Dhillon, S. Mallela, and R. Kumar, Enhanced word clustering for hierarchical text classification, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, pp.191-200, 2002.
DOI : 10.1145/775047.775076

I. S. Dhillon, S. Mallela, and D. S. Modha, Information-theoretic co-clustering, KDD '03, pp.89-98, 2003.

P. J. Diggle, Statistical analysis of spatial point patterns, 1983.

P. Doreian, V. Batagelj, and A. Ferligoj, Generalized blockmodeling of two-mode network data, 2004.

R. O. Duda, P. E. Hart, and D. G. Stork, Unsupervised learning and clustering, Pattern classification, p.571, 2001.

J. G. Dy and C. E. Brodley, Feature selection for unsupervised learning, The Journal of Machine Learning Research, vol.5, pp.845-889, 2004.

T. Eckes and P. Orlik, An error variance approach to two-mode hierarchical clustering, Journal of Classification, vol.58, issue.1, pp.51-74, 1993.
DOI : 10.1007/BF02638453

S. Fortunato, Community detection in graphs, Physics Reports, vol.486, issue.3-5, pp.75-174, 2010.
DOI : 10.1016/j.physrep.2009.11.002

C. Fraley and A. E. Raftery, How many clusters? Which clustering method? Answers via model-based cluster analysis, The Computer Journal, vol.41, pp.578-588, 1998.

M. Friendly, Mosaic Displays for Multi-Way Contingency Tables, Journal of the American Statistical Association, vol.3, issue.425, pp.190-200, 1994.
DOI : 10.1080/00031305.1974.10479053

A. E. Gelfand and A. F. Smith, Sampling-Based Approaches to Calculating Marginal Densities, Journal of the American Statistical Association, vol.4, issue.410, pp.398-409, 1990.
DOI : 10.1080/01621459.1986.10478240

L. Geng and H. J. Hamilton, Interestingness measures for data mining, ACM Computing Surveys, vol.38, issue.3, 2006.
DOI : 10.1145/1132960.1132963

M. Girvan and M. E. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences, pp.7821-7826, 2002.
DOI : 10.1073/pnas.122653799

R. Y. Gnabéli, La production d'une identité autochtone en Côte d'Ivoire, pp.114-115247, 2008.

A. Goldenberg, A. X. Zheng, S. E. Fienberg, and E. M. Airoldi, A survey of statistical network models, Machine Learning, pp.129-233, 2009.

G. Govaert, Algorithme de classification d'un tableau de contingence, First international symposium on data analysis and informatics, pp.487-500, 1977.

G. Govaert, Simultaneous clustering of rows and columns, Control and Cybernetics, vol.24, issue.4, pp.437-458, 1995.

G. Govaert and M. Nadif, Clustering with block mixture models, Pattern Recognition, vol.36, issue.2, pp.463-473, 2003.
DOI : 10.1016/S0031-3203(02)00074-2

G. Govaert and M. Nadif, Co-Clustering, 2013.

P. D. Grünwald, The Minimum Description Length Principle, 2007.

R. Guigourès and M. Boullé, Segmentation of towns using call detail records, 2011.

R. Guigourès, M. Boullé, and F. Rossi, A Triclustering Approach for Time Evolving Graphs, 2012 IEEE 12th International Conference on Data Mining Workshops, pp.115-122, 2012.
DOI : 10.1109/ICDMW.2012.61

R. Guigourès, M. Boullé, and F. Rossi, Étude des corrélations spatiotemporelles des appels mobiles en France, Extraction et gestion des connaissances, pp.437-448, 2013.

I. Guyon and A. Elisseeff, An introduction to variable and feature selection, The Journal of Machine Learning Research, vol.3, pp.1157-1182, 2003.

P. Hansen and N. Mladenovic, Variable neighborhood search: Principles and applications, European Journal of Operational Research, vol.130, issue.3, pp.449-467, 2001.
DOI : 10.1016/S0377-2217(00)00100-4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.93.1769

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2009.

G. Hébrail, B. Hugueney, Y. Lechevallier, and F. Rossi, Exploratory analysis of functional data via clustering and optimal segmentation, Neurocomputing, vol.73, issue.7-9, p.1125, 2010.
DOI : 10.1016/j.neucom.2009.11.022

J. L. Hintze and R. D. Nelson, Violin plots: a box plot-density trace synergism, The American Statistician, pp.181-184, 1998.

P. W. Holland, K. Laskey, and S. Leinhardt, Stochastic blockmodels: First steps, Social Networks, vol.5, issue.2, pp.109-137, 1983.
DOI : 10.1016/0378-8733(83)90021-7

J. Hopcroft, O. Khan, B. Kulis, and B. Selman, Tracking evolving communities in large linked networks, pp.5249-5253, 2004.

L. Hubert and P. Arabie, Comparing partitions, Journal of Classification, vol.78, issue.1, pp.193-218, 1985.
DOI : 10.1007/BF01908075

A. K. Jain, Data clustering: 50 years beyond K-means, Pattern Recognition Letters, vol.31, issue.8, pp.651-666, 2010.
DOI : 10.1016/j.patrec.2009.09.011

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.4286

A. K. Jain and R. C. Dubes, Algorithms for clustering data, 1988.

M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, An Introduction to Variational Methods for Graphical Models, Machine learning, vol.37, issue.2, pp.183-233, 1999.
DOI : 10.1007/978-94-011-5014-9_5

C. Kemp and J. B. Tenenbaum, Learning systems of concepts with an infinite relational model, AAAI'06, 2006.

Y. Kluger, R. Basri, J. T. Chang, and M. Gerstein, Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions, Genome Research, vol.13, issue.4, pp.703-716, 2003.
DOI : 10.1101/gr.648603

S. Kullback and R. A. Leibler, On Information and Sufficiency, The Annals of Mathematical Statistics, vol.22, issue.1, pp.79-86, 1951.
DOI : 10.1214/aoms/1177729694

P. Latouche, E. Birmelé, and C. Ambroise, Bayesian Methods for Graph Clustering, Advances in Data Analysis, Data Handling and Business Intelligence, pp.229-239, 2010.
DOI : 10.1007/978-3-642-01044-6_21

URL : https://hal.archives-ouvertes.fr/hal-00629294

J. Leskovec, J. Kleinberg, and C. Faloutsos, Graphs over time, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD '05, pp.177-187, 2005.
DOI : 10.1145/1081870.1081893

Y. H. Li and A. K. Jain, Classification of Text Documents, The Computer Journal, vol.41, issue.8, pp.537-546, 1998.
DOI : 10.1093/comjnl/41.8.537

J. Lin, Divergence measures based on the Shannon entropy, IEEE Transactions on Information Theory, vol.37, issue.1, pp.145-151, 1991.
DOI : 10.1109/18.61115

H. Liu and L. Yu, Toward integrating feature selection algorithms for classification and clustering, IEEE Transactions on Knowledge and Data Engineering, vol.17, issue.4, pp.491-502, 2005.

F. Lorrain and H. C. White, Structural equivalence of individuals in social networks, The Journal of Mathematical Sociology, vol.18, issue.1, pp.49-80, 1971.
DOI : 10.1086/223084

S. C. Madeira and A. L. Oliveira, Biclustering algorithms for biological data analysis: a survey, IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.1, issue.1, pp.24-45, 2004.
DOI : 10.1109/TCBB.2004.2

G. W. Milligan and M. C. Cooper, An examination of procedures for determining the number of clusters in a data set, Psychometrika, vol.77, issue.2, pp.159-179, 1985.
DOI : 10.1007/BF02294245

B. Mirkin, Mathematical classification and clustering, 1996.
DOI : 10.1007/978-1-4613-0457-9

B. Mirkin, P. Arabie, and L. Hubert, Additive two-mode clustering: The error-variance approach revisited, Journal of Classification, vol.86, issue.11, pp.243-263, 1995.
DOI : 10.1007/BF03040857

M. Nadif and G. Govaert, Model-based co-clustering for continuous data, ICMLA, pp.175-180, 2010.

R. M. Neal, Markov chain sampling methods for Dirichlet process mixture models, Journal of Computational and Graphical Statistics, vol.9, issue.2, pp.249-265, 2000.

A. Y. Ng, M. I. Jordan, and Y. Weiss, On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems, pp.849-856, 2002.

X. L. Nguyen and A. Gelfand, The Dirichlet labeling process for clustering functional data, Statistica Sinica, vol.21, issue.3, pp.1249-1289, 2011.
DOI : 10.5705/ss.2008.285

K. Nowicki and T. Snijders, Estimation and Prediction for Stochastic Blockstructures, Journal of the American Statistical Association, vol.96, issue.455, pp.1077-1087, 2001.
DOI : 10.1198/016214501753208735

G. Palla, A. Barabási, and T. Vicsek, Quantifying social group evolution, Nature, vol.21, issue.7136, p.446, 2007.
DOI : 10.1038/nature05670

G. Palla, I. Derenyi, I. Farkas, and T. Vicsek, Uncovering the overlapping community structure of complex networks in nature and society, Nature, vol.387, issue.7043, pp.814-818, 2005.
DOI : 10.1038/nature03248

J. Pitman, Combinatorial stochastic processes, volume 1875 of Lecture Notes in Mathematics, 2006.

A. Pothen, H. D. Simon, and K. Liou, Partitioning Sparse Matrices with Eigenvectors of Graphs, SIAM Journal on Matrix Analysis and Applications, vol.11, issue.3, pp.430-452, 1990.
DOI : 10.1137/0611030

H. Ralambondrainy, A conceptual version of the K-means algorithm, Pattern Recognition Letters, vol.16, issue.11, pp.1147-1157, 1995.
DOI : 10.1016/0167-8655(95)00075-R

J. O. Ramsay and B. W. Silverman, Functional Data Analysis, 2005.

W. M. Rand, Objective Criteria for the Evaluation of Clustering Methods, Journal of the American Statistical Association, vol.15, issue.336, pp.846-850, 1971.
DOI : 10.1080/01621459.1963.10500845

J. Reichardt and D. R. White, Role models for complex networks, The European Physical Journal B, vol.9, issue.2, 2007.
DOI : 10.1140/epjb/e2007-00340-y

J. Rissanen, Modeling by shortest data description, Automatica, vol.14, issue.5, pp.465-471, 1978.
DOI : 10.1016/0005-1098(78)90005-5

F. Rossi and N. Villa-vialaneix, Représentation d'un grand réseau à partir d'une classification hiérarchique de ses sommets, Journal de la Société Française de Statistique, pp.34-65, 2012.

P. J. Rousseeuw, Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, Journal of Computational and Applied Mathematics, vol.20, pp.53-65, 1987.
DOI : 10.1016/0377-0427(87)90125-7

S. E. Schaeffer, Graph clustering, Computer Science Review, vol.1, issue.1, pp.27-64, 2007.
DOI : 10.1016/j.cosrev.2007.05.001

H. Shan and A. Banerjee, Bayesian Co-clustering, 2008 Eighth IEEE International Conference on Data Mining, pp.530-539, 2008.
DOI : 10.1109/ICDM.2008.91

C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, vol.27, 1948.

A. Silberschatz and A. Tuzhilin, On subjective measure of interestingness in knowledge discovery, KDD, pp.275-281, 1995.

N. Slonim and N. Tishby, Document clustering using word clusters via the information bottleneck method, ACM SIGIR 2000, pp.208-215, 2000.

A. Strehl and J. Ghosh, Cluster ensembles - a knowledge reuse framework for combining multiple partitions, JMLR, vol.3, pp.583-617, 2003.

J. Sun, C. Faloutsos, S. Papadimitriou, and P. S. Yu, GraphScope, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '07, pp.687-696, 2007.
DOI : 10.1145/1281192.1281266

R. Tibshirani, G. Walther, and T. Hastie, Estimating the number of clusters in a data set via the gap statistic, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.63, issue.2, pp.411-423, 2001.
DOI : 10.1111/1467-9868.00293

I. Van Mechelen, H. Bock, and P. De Boeck, Two-mode clustering methods: a structured overview, Statistical Methods in Medical Research, pp.363-394, 2004.

J. E. Vogt, S. Prabhakaran, T. J. Fuchs, and V. Roth, The translation-invariant Wishart-Dirichlet process for clustering distance data, 2010.

C. S. Wallace and D. M. Boulton, An Information Measure for Classification, The Computer Journal, vol.11, issue.2, pp.185-194, 1968.
DOI : 10.1093/comjnl/11.2.185

H. M. Wallach, S. T. Jensen, L. Dicker, and K. A. Heller, An alternative prior process for nonparametric Bayesian clustering, AISTATS, pp.892-899, 2010.

S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Structural Analysis in the Social Sciences, 1994.
DOI : 10.1017/CBO9780511815478

D. R. White and K. P. Reitz, Graph and semigroup homomorphisms on networks of relations, Social Networks, vol.5, issue.2, 1983.
DOI : 10.1016/0378-8733(83)90025-4

H. C. White, S. Boorman, and R. Breiger, Social Structure from Multiple Networks. I. Blockmodels of Roles and Positions, American Journal of Sociology, vol.81, issue.4, pp.730-780, 1976.
DOI : 10.1086/226141

R. D. Wilson and T. R. Martinez, Improved Heterogeneous Distance Functions, Journal of Artificial Intelligence Research, vol.6, pp.1-34, 1997.

E. P. Xing, W. Fu, and L. Song, A state-space mixed membership blockmodel for dynamic network tomography, The Annals of Applied Statistics, vol.4, issue.2, pp.535-566, 2010.
DOI : 10.1214/09-AOAS311

L. Zhao and M. J. Zaki, TRICLUSTER, Proceedings of the 2005 ACM SIGMOD international conference on Management of data, SIGMOD '05, pp.694-705, 2005.
DOI : 10.1145/1066157.1066236