C. C. Aggarwal and C. Zhai, A survey of text classification algorithms, Mining text data, pp.163-222, 2012.
DOI : 10.1007/978-1-4614-3223-4_6

URL : http://charuaggarwal.net/text-class.pdf

M. Ahat, S. B. Amor, M. Bui, M. Lamure, and M. Courel, Pollution Modeling and Simulation with Multi-Agent and Pretopology, Complex Sciences, First International Conference, vol.4, pp.225-231, 2009.
DOI : 10.1007/978-3-642-02466-5_20

URL : https://hal.archives-ouvertes.fr/hal-01121176

M. Ahat, S. B. Amor, M. Bui, S. Jhean-Larose et al., Document Classification with LSA and Pretopology, Studia Informatica Universalis, vol.8, issue.1, 2010.
URL : https://hal.archives-ouvertes.fr/halshs-00642761

A. Ahmed, M. Aly, J. Gonzalez, S. Narayanamurthy, and A. J. Smola, Scalable inference in latent variable models, Proceedings of the fifth ACM international conference on Web search and data mining, pp.123-132, 2012.
DOI : 10.1145/2124295.2124312

L. M. Aiello, A. Barrat, R. Schifanella, C. Cattuto, B. Markines et al., Friendship Prediction and Homophily in Social Media, ACM Trans. Web, vol.6, issue.2, p.33, 2012.
DOI : 10.1145/2180861.2180866

URL : https://hal.archives-ouvertes.fr/hal-00718085

S. M. Ali and S. D. Silvey, A General Class of Coefficients of Divergence of One Distribution from Another, Journal of the Royal Statistical Society. Series B (Methodological), vol.28, issue.1, pp.131-142, 1966.

G. M. Amdahl, Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities, Spring Joint Computer Conference, AFIPS '67 (Spring), pp.483-485, 1967.

S. B. Amor, Percolation, prétopologie et multialéatoires, contributions à la modélisation des systèmes complexes : exemple du contrôle aérien, Ecole Pratique des Hautes Etudes, 2008.

S. B. Amor and M. Bui, Généralisation des processus de percolation discrets, Stud. Inform. Univ, vol.7, issue.1, pp.78-93, 2009.

S. B. Amor, M. Bui, and M. Lamure, Modeling urban aerial pollution using stochastic pretopology, Africa Mathematics Annals, vol.1, issue.1, pp.7-19, 2010.

S. B. Amor, V. Levorato, and I. Lavallée, Generalized Percolation Processes Using Pretopology Theory, 2007 IEEE International Conference on Research, Innovation and Vision for the Future in Computing & Communication Technologies, pp.130-134, 2007.
DOI : 10.1109/rivf.2007.369146

URL : https://hal.archives-ouvertes.fr/hal-00460599

M. Archoun, Modélisation prétopologique de la segmentation par croissance de régions des images à niveau de gris, 1983.

G. Arnaud, M. Lamure, M. Terrenoire, and D. Tounissoux, Analysis of the connectivity of an object in a binary image: a pretopological approach, Proc. of the 8th IAPR Conference, 1986.

Arvind and D. E. Culler, Dataflow Architectures, Annual Review of Computer Science, vol.1, issue.1, pp.225-253, 1986.

J. Auray, Contribution à l'étude des structures pauvres, 1982.

J. Auray, Structures pauvres, Stud. Inform. Univ, vol.7, issue.1, pp.94-130, 2009.

J. Auray, M. Brissaud, and G. Duru, Les apports de la prétopologie, 112e Congrès national des sociétés savantes, vol.IV, pp.15-29, 1987.

J. Auray, G. Duru, and M. Mougeot, A pretopological analysis of input output model, Economics Letters, vol.2, issue.4, 1979.

C. Basileu, Modélisation structurelle des réseaux sociaux : application à un système d'aide à la décision en cas de crise sanitaire, vol.1, 2011.

C. Basileu, S. B. Amor, M. Bui, and M. Lamure, Prétopologie stochastique et réseaux complexes, Stud. Inform. Univ, vol.10, issue.2, pp.73-138, 2012.

Z. Belmandt, Manuel de prétopologie et ses applications, 1993.

Z. Belmandt, Basics of Pretopology. Hermann, 2011.

C. Berge, The Theory of Graphs. Courier Corporation, 1962.

D. M. Blei, Probabilistic topic models, Communications of the ACM, vol.55, issue.4, pp.77-84, 2012.

D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet Allocation, J. Mach. Learn. Res, vol.3, pp.993-1022, 2003.

S. Bonnevay, Extraction de caractéristiques de texture par codages des extrema de gris et traitement prétopologique des images, 1997.

S. Bonnevay, Pretopological operators for gray-level image analysis, Stud. Inform. Univ, vol.7, issue.1, pp.173-195, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00515396

S. Bonnevay, M. Lamure, C. Largeron-Leténo, and N. Nicoloyannis, A pretopological approach for structuring data in non-metric spaces, Electronic Notes in Discrete Mathematics, vol.2, pp.1-9, 1999.
DOI : 10.1016/s1571-0653(04)00011-3

S. Bonnevay and C. Largeron, Data analysis based on minimal closed subsets, The international federation of Classification Societies, pp.303-308, 2000.
DOI : 10.1007/978-3-642-59789-3_48

M. Bouayad, Prétopologie et reconnaissance des formes, INSA, 1998.

Q. V. Bui,

M. Boubou, Contribution aux méthodes de classification non supervisée via des approches prétopologiques et d'agrégation d'opinions, PhD thesis, 2007.

R. L. Breiger, K. M. Carley, and P. Pattison, Dynamic Social Network Modeling and Analysis: workshop summary and papers, 2003.

L. Breiman, Bagging predictors, Machine Learning, vol.24, pp.123-140, 1996.

L. Breiman, Random Forests, Machine Learning, vol.45, pp.5-32, 2001.

M. Brissaud, Les espaces prétopologiques, Compte-rendu de l'Académie des Sciences, vol.280, pp.705-708, 1975.

M. Brissaud, Espaces prétopologiques généralisés et application: Connexités, Compacité, Espaces préférenciés généraux, URA 394, 1986.

M. Brissaud, Analyse prétopologique du recouvrement d'un référentiel. Connexités et point fixe, XXIIIe colloque Structures économiques et économétrie, 1991.

M. Brissaud, Adhérence et acceptabilité multicritères. Analyse prétopologique, XXIVème colloque Structures économiques et économétrie, 1992.

M. Brissaud, Retour sur les origines de la prétopologie, Stud. Inform. Univ, vol.7, issue.1, pp.5-23, 2009.

M. Brissaud, J. Auray, G. Duru, M. Lamure, and C. Siani, Eléments de pré-topologie généralisée, Stud. Inform. Univ, vol.7, issue.1, pp.45-77, 2009.

M. Bui, S. B. Amor, M. Lamure, and C. Basileu, Gesture Trajectories Modeling Using Quasipseudometrics and Pre-topology for Its Evaluation, Information Processing and Management of Uncertainty in Knowledge-Based Systems - 15th International Conference, vol.443, pp.116-134, 2014.

Q. V. Bui, S. B. Amor, and M. Bui, Stochastic Pretopology as a Tool for Topological Analysis of Complex Systems, Intelligent Information and Database Systems - 10th Asian Conference, ACIIDS 2018, Dong Hoi City, vol.10752, pp.102-111, 2018.

Q. V. Bui, K. Sayadi, S. B. Amor, and M. Bui, Combining Latent Dirichlet Allocation and K-Means for Documents Clustering: Effect of Probabilistic Based Distance Measures, Intelligent Information and Database Systems - 9th Asian Conference, pp.248-257, 2017.

Q. V. Bui, K. Sayadi, and M. Bui, A multi-criteria document clustering method based on topic modeling and pseudoclosure function, Proceedings of the Sixth International Symposium on Information and Communication Technology, pp.38-45, 2015.

Q. V. Bui, K. Sayadi, and M. Bui, A Multi-Criteria Document Clustering Method Based on Topic Modeling and Pseudoclosure Function, Informatica, vol.40, issue.2, pp.169-180, 2016.

W. Buntine, Estimating Likelihoods for Topic Models, Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning, ACML '09, pp.51-64, 2009.

D. Buscaldi, G. Dias, V. Levorato, and C. Largeron, QASSIT: A Pretopological Framework for the Automatic Construction of Lexical Taxonomies from Raw Texts, Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2015, pp.955-959, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01144344

F. M. Cardoso, S. Meloni, A. Santanche, and Y. Moreno, Topical homophily in online social systems, 2017.

K. M. Carley, M. K. Martin, and B. R. Hirshman, The etiology of social change, Topics in Cognitive Science, vol.1, issue.4, pp.621-650, 2009.

E. Cech, Topological Spaces, 1966.

S. Cha, Comprehensive survey on distance/similarity measures between probability density functions, International Journal of Mathematical Models and Methods in Applied Sciences, vol.1, issue.4, pp.300-307, 2007.

G. Cleuziou, D. Buscaldi, V. Levorato, and G. Dias, A pretopological framework for the automatic construction of lexical-semantic structures from texts, Proceedings of the 20th ACM Conference on Information and Knowledge Management, pp.2453-2456, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00825232

A. Criminisi, J. Shotton, and E. Konukoglu, Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, Foundations and Trends in Computer Graphics and Vision, vol.7, pp.81-227, 2012.
DOI : 10.1561/0600000035

I. Csiszár, Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Magyar. Tud. Akad. Mat. Kutató Int. Közl, vol.8, pp.85-108, 1963.

M. Dalud-Vincent, Modèle Prétopologique pour une méthodologie d'analyse de réseaux: concepts et algorithmes, 1994.

M. Dalud-Vincent, M. Brissaud, and M. Lamure, Pretopology as an extension of graph theory: the case of strong connectivity, International Journal of Applied Mathematics, vol.5, issue.4, pp.455-472, 2001.
URL : https://hal.archives-ouvertes.fr/halshs-00470194

M. Dalud-Vincent, M. Brissaud, and M. Lamure, Closed sets and closures in pretopology, International Journal of Pure and Applied Mathematics, pp.391-402, 2009.
URL : https://hal.archives-ouvertes.fr/halshs-01778084

M. Dalud-Vincent, M. Brissaud, and M. Lamure, Connectivities and Partitions in a Pretopological Space, International Mathematical Forum, vol.6, issue.45, pp.2201-2215, 2011.
URL : https://hal.archives-ouvertes.fr/halshs-00610696

M. Dalud-Vincent, M. Brissaud, M. Lamure, and R. G. Paradin, Pretopology, matroïdes and hypergraphs, International Journal of Pure and Applied Mathematics, vol.67, issue.4, pp.363-375, 2011.

R. Dapoigny, M. Lamure, and N. Nicoloyannis, Pretopological Transformations of Binary Images: A Parallel Implementation, Proceedings of the Seventh IASTED/ISMM International Conference on Parallel and Distributed Computing and Systems, pp.288-291, 1995.

J. Debayle and J. Pinoli, General Adaptive Neighborhood-Based Pretopological Image Filtering, Journal of Mathematical Imaging and Vision, vol.41, issue.3, pp.210-221, 2011.
DOI : 10.1007/s10851-011-0271-5

URL : https://hal.archives-ouvertes.fr/emse-00643414

M. Deza and E. Deza, Dictionary of distances, 2006.

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2000.

G. Duru, Nouveaux éléments de prétopologie, Faculté de Droit et des Sciences économiques de Besançon, 1977.

G. Duru, Contribution à l'étude des structures des systèmes complexes dans les Sciences Humaines, 1980.

D. Easley and J. Kleinberg, Networks, Crowds, and Markets, 2010.
DOI : 10.1017/cbo9780511761942

M. Egea, Prétopologie floues, Stud. Inform. Univ, vol.7, issue.1, pp.131-171, 2009.

H. Emptoz, Modèles prétopologiques pour la reconnaissance des formes. Application en Neurophysiologie, 1983.

A. Ferligoj and V. Batagelj, Direct multicriteria clustering algorithms, Journal of Classification, vol.9, issue.1, pp.43-61, 1992.
DOI : 10.1007/bf02618467

S. Fouchal, M. Ahat, I. Lavallée, M. Bui, and S. B. Amor, Clustering Based on Kolmogorov Information, Knowledge-Based and Intelligent Information and Engineering Systems - 14th International Conference, KES 2010, vol.6276, pp.452-460, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00643426

M. Fréchet, Espaces Abstraits. Hermann, 1928.

C. Frélicot and H. Emptoz, A Pretopological Approach for Pattern Classification with Reject Options, Advances in Pattern Recognition, Joint IAPR International Workshops SSPR '98 and SPR '98, vol.1451, pp.707-715, 1998.

C. Frélicot and F. Lebourgeois, A pretopology-based supervised pattern classifier, Fourteenth International Conference on Pattern Recognition, pp.106-109, 1998.

N. T. Gayraud, E. Pitoura, and P. Tsaparas, Diffusion Maximization in Evolving Social Networks, Proceedings of the 2015 ACM on Conference on Online Social Networks, COSN '15, pp.125-135, 2015.
DOI : 10.1145/2817946.2817965

J. Gil-Aluja and A. M. Lafuente, Towards an Advanced Modelling of Complex Economic Phenomena - Pretopological and Topological Uncertainty Research Tools, Studies in Fuzziness and Soft Computing, vol.276, 2012.

M. Girvan and M. E. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences of the United States of America, pp.7821-7826, 2002.
DOI : 10.1073/pnas.122653799

URL : http://www.pnas.org/content/99/12/7821.full.pdf

J. Goldenberg, B. Libai, and E. Muller, Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth, Marketing Letters, vol.12, issue.3, pp.211-223, 2001.

A. Gordon, Classification, 2nd Edition. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, 1999.
DOI : 10.1201/9781584888536

M. Grandjean, A social network analysis of Twitter: Mapping the digital humanities community, Cogent Arts & Humanities, vol.3, issue.1, p.1171458, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01517493

M. Granovetter, Threshold Models of Collective Behavior, American Journal of Sociology, vol.83, issue.6, pp.1420-1443, 1978.
DOI : 10.1086/226707

T. L. Griffiths and M. Steyvers, Finding scientific topics, Proceedings of the National Academy of Sciences of the United States of America, vol.101, pp.5228-5235, 2004.
DOI : 10.1073/pnas.0307752101

URL : http://www.pnas.org/content/101/suppl_1/5228.full.pdf

A. Guille, H. Hacid, C. Favre, and D. A. Zighed, Information diffusion in online social networks: a survey, ACM SIGMOD Record, vol.42, issue.1, p.17, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00848050

N. J. Gunther, A General Theory of Computational Scalability Based on Rational Functions, 2008.

N. J. Gunther, P. Puglia, and K. Tomasette, Hadoop superlinear scalability, Communications of the ACM, vol.58, pp.46-55, 2015.
DOI : 10.1145/2719919

G. Guérard, S. B. Amor, and A. Bui, A Context-free Smart Grid Model using Pretopologic Structure, SMARTGREENS 2015 - Proceedings of the 4th International Conference on Smart Cities and Green ICT Systems, pp.335-341, 2015.

G. Heinrich, Parameter estimation for text analysis, 2004.

K. T. Ho, Q. V. Bui, and M. Bui, Dynamic Social Network Analysis Using Author-Topic Model, Innovations for Community Services - 18th International Conference, I4CS 2018, vol.863, pp.47-62, 2018.

T. K. Ho, Random decision forests, Proceedings of the Third International Conference on Document Analysis and Recognition, vol.1, pp.278-282, 1995.

T. K. Ho, Q. V. Bui, and M. Bui, Homophily Independent Cascade Diffusion Model Based On Textual Information, 10th International Conference on Computational Collective Intelligence (ICCCI 2018), 2018.
DOI : 10.1007/978-3-319-98443-8_13

M. Hoffman, F. R. Bach, and D. M. Blei, Online Learning for Latent Dirichlet Allocation, Advances in Neural Information Processing Systems 23, pp.856-864, 2010.

M. D. Hoffman, D. M. Blei, C. Wang, and J. Paisley, Stochastic Variational Inference, J. Mach. Learn. Res, vol.14, issue.1, pp.1303-1347, 2013.

T. Hofmann, Probabilistic Latent Semantic Indexing, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '99, pp.50-57, 1999.
DOI : 10.1145/3130348.3130370

P. Holme and J. Saramäki, Temporal Networks, Physics Reports, vol.519, issue.3, pp.97-125, 2012.

A. Huang, Similarity measures for text document clustering, Proceedings of the sixth new zealand computer science research student conference (NZCSRSC2008), pp.49-56, 2008.

L. Hubert and P. Arabie, Comparing partitions, Journal of classification, vol.2, issue.1, pp.193-218, 1985.
DOI : 10.1007/bf01908075

M. Steinbach, G. Karypis, and V. Kumar, A comparison of document clustering techniques, KDD Workshop on Text Mining, 2000.

D. Kempe, J. Kleinberg, and E. Tardos, Maximizing the Spread of Influence Through a Social Network, Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.137-146, 2003.

D. Kempe, J. M. Kleinberg, and E. Tardos, Influential Nodes in a Diffusion Model for Social Networks, Automata, Languages and Programming, 32nd International Colloquium, ICALP 2005, pp.1127-1138, 2005.

M. Kimura and K. Saito, Tractable Models for Information Diffusion in Social Networks, Knowledge Discovery in Databases: PKDD 2006, pp.259-271, 2006.
DOI : 10.1007/11871637_27

URL : https://link.springer.com/content/pdf/10.1007%2F11871637_27.pdf

M. Klassen and N. Paturi, Web document classification by keywords using random forests, Networked Digital Technologies, pp.256-261, 2010.
DOI : 10.1007/978-3-642-14306-9_26

E. M. Kleinberg, An overtraining-resistant stochastic modeling method for pattern recognition, The Annals of Statistics, vol.24, issue.6, pp.2319-2349, 1996.
DOI : 10.1214/aos/1032181157

URL : https://doi.org/10.1214/aos/1032181157

D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning, 2009.

S. Kullback and R. A. Leibler, On Information and Sufficiency, The Annals of Mathematical Statistics, vol.22, issue.1, pp.79-86, 1951.
DOI : 10.1214/aoms/1177729694

URL : https://doi.org/10.1214/aoms/1177729694

K. Kuratowski, Topologie, Nakł. Polskiego Towarzystwa Matematycznego, p.3014396, 1952.

M. Lamure, Espaces abstraits et reconnaissance des formes, 1987.

M. Lamure, S. Bonnevay, M. Bui, and S. B. Amor, A Stochastic and Pretopological Modeling Aerial Pollution of an Urban Area, Stud. Inform. Univ, vol.7, issue.3, pp.410-426, 2009.
URL : https://hal.archives-ouvertes.fr/halshs-00642878

M. Lamure and J. J. Milan, A System of Image Analysis Based on a Pretopological Approach, Intelligent Autonomous Systems, An International Conference, pp.340-345, 1986.

T. K. Landauer and S. T. Dumais, A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge, Psychological Review, pp.211-240, 1997.

D. Laniado, Y. Volkovich, K. Kappler, and A. Kaltenbrunner, Gender homophily in online dyadic and triadic relationships, EPJ Data Science, vol.5, issue.1, p.19, 2016.
DOI : 10.1140/epjds/s13688-016-0080-6

URL : https://epjdatascience.springeropen.com/track/pdf/10.1140/epjds/s13688-016-0080-6

C. Largeron and S. Bonnevay, A pretopological approach for structural analysis, Information Sciences, vol.144, issue.1-4, pp.169-185, 2002.

P. F. Lazarsfeld and R. K. Merton, Friendship as a social process: A substantive and methodological analysis, Freedom and Control in Modern Society, vol.18, pp.18-66, 1954.

T. V. Le, Classification prétopologique des données : application à l'analyse des trajectoires patients, 2007.

T. V. Le, N. Kabachi, and M. Lamure, A clustering method associating pretopological concepts and k-means algorithm, Proceedings of the International Conference on Research Innovation and Vision for the Future, 2007.

T. V. Le, T. N. Truong, H. N. Nguyen, and T. V. Pham, An Efficient Pretopological Approach for Document Clustering, Proceedings of the 2013 5th International Conference on Intelligent Networking and Collaborative Systems, INCOS '13, pp.114-120, 2013.

F. Lebourgeois, M. Bouayad, and H. Emptoz, Structure Relation between Classes for Supervised Learning using Pretopology, Fifth International Conference on Document Analysis and Recognition, pp.33-36, 1999.

F. Lebourgeois and H. Emptoz, Pretopological approach for supervised learning, 13th International Conference on Pattern Recognition, pp.256-260, 1996.

R. O. Legendi and L. Gulyás, Agent-Based Dynamic Network Models: Validation on Empirical Data, Advances in Social Simulation, Advances in Intelligent Systems and Computing, pp.49-60, 2014.

V. Levorato, Contributions à la Modélisation des Réseaux Complexes : Prétopologie et Applications. (Contributions to the Modeling of Complex Networks: Pretopology and Applications), 2008.

V. Levorato, Modeling Groups In Social Networks, 25th European Conference on Modelling and Simulation, pp.129-134, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00605395

V. Levorato and S. B. Amor, PretopoLib : la librairie JAVA de la prétopologie, Extraction et gestion des connaissances (EGC'2010), Actes, 26 au 29 janvier 2010, Hammamet, Tunisie, volume RNTI-E-19 of Revue des Nouvelles Technologies de l'Information, pp.643-644, 2010.

V. Levorato and M. Bui, Modeling the Complex Dynamics of Distributed Communities of the Web with Pretopology, 10th International Conference on Innovative Internet Community Services (I2CS '07), pp.306-320, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00542262

V. Levorato and M. Bui, Data Structures and Algorithms for Pretopology: the JAVA based software library PretopoLib, IEEE, editor, (I2CS), pp.122-134, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01123654

V. Levorato, T. V. Le, M. Lamure, and M. Bui, Classification prétopologique basée sur la complexité de Kolmogorov, Stud. Inform. Univ, vol.7, issue.1, pp.197-222, 2009.

V. Levorato and C. Petermann, Detection of communities in directed networks based on strongly p-connected components, Computational Aspects of Social Networks (CASoN), 2011 International Conference on, pp.211-216, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00634309

V. Levorato and M. Senot, Discrete Signal Machines via Pretopology, in H. Bordihn, R. Freund, M. Holzer, T. Hinze et al. (eds.), Second Workshop on Non-Classical Models for Automata and Applications - NCMA 2010, pp.127-140, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00511950

J. Lin, Divergence measures based on the Shannon entropy, IEEE Transactions on Information theory, vol.37, issue.1, pp.145-151, 1991.

Y. S. Lin, J. Y. Jiang, and S. J. Lee, A Similarity Measure for Text Classification and Clustering, IEEE Transactions on Knowledge and Data Engineering, vol.26, issue.7, pp.1575-1590, 2014.

D. Liparas, Y. Hacohen-Kerner, A. Moumtzidou, S. Vrochidis, and I. Kompatsiaris, News Articles Classification Using Random Forests and Weighted Multimodal Features, Multidisciplinary Information Retrieval, pp.63-75, 2014.

J. Looman and J. Campbell, Adaptation of Sorensen's K (1948) for Estimating Unit Affinities in Prairie Vegetation, Ecology, vol.41, issue.3, pp.409-416, 1960.

Y. Lu, Q. Mei, and C. Zhai, Investigating task performance of probabilistic topic models: an empirical study of PLSA and LDA, Information Retrieval, vol.14, issue.2, pp.178-203, 2010.

Y. Lu, S. Okada, and K. Nitta, Semi-supervised Latent Dirichlet Allocation for Multi-label Text Classification, Recent Trends in Applied Artificial Intelligence, number 7906 in Lecture Notes in Computer Science, pp.351-360, 2013.
DOI : 10.1007/978-3-642-38577-3_36

D. M. Blei, A. Y. Ng, and M. Jordan, Latent Dirichlet Allocation, The Journal of Machine Learning Research, pp.601-608, 2001.

K. Maher and M. S. Joshi, Effectiveness of Different Similarity Measures for Text Classification and Clustering, International Journal of Computer Science and Information Technologies, vol.7, issue.4, pp.1715-1720, 2016.

D. Mammass, S. Djeziri, and F. Nouboud, A Pretopological Approach for Image Segmentation and Edge Detection, Journal of Mathematical Imaging and Vision, vol.15, issue.3, pp.169-179, 2001.

D. Mammass, M. E. Yassa, F. Nouboud, and A. Chalifour, A Multicriterion Pretopological Approach for Image Segmentation, 3rd International Conference: Sciences of Electronic Technologies of Information and Telecommunications (SETIT 2005), 2005.

C. D. Manning and P. Raghavan, An Introduction to Information Retrieval, 2009.

A. K. McCallum, MALLET: A Machine Learning for Language Toolkit, 2002.

M. McPherson, L. Smith-Lovin, and J. M. Cook, Birds of a Feather: Homophily in Social Networks, Annual Review of Sociology, vol.27, issue.1, pp.415-444, 2001.

A. Meziane, T. Iftene, and N. Selmaoui, Satellite image segmentation by mathematical pretopology and automatic classification, Proceedings of SPIE - The International Society for Optical Engineering, 1997.
DOI : 10.1117/12.295607

J. R. Millar, G. L. Peterson, and M. J. Mendenhall, Document Clustering and Visualization with Latent Dirichlet Allocation and Self-Organizing Maps, FLAIRS Conference, vol.21, pp.69-74, 2009.

Z. Ming, C. Luo, W. Gao, R. Han, Q. Yang et al., Bdgs: A scalable big data generator suite in big data benchmarking, Advancing Big Data Benchmarks, pp.138-154, 2014.
DOI : 10.1007/978-3-319-10596-3_11

URL : http://arxiv.org/pdf/1401.5465.pdf

D. S. Modha and W. S. Spangler, Feature weighting in k-means clustering, Machine Learning, vol.52, pp.217-237, 2003.

I. Molchanov, Theory of Random Sets, 2005.
DOI : 10.1007/978-1-4471-7349-6

T. Morimoto, Markov Processes and the H-Theorem, Journal of the Physical Society of Japan, vol.18, issue.3, pp.328-331, 1963.

F. Morstatter, J. Pfeffer, H. Liu, and K. M. Carley, Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose, 2013.

J. Myllymaki, Effective web data extraction with standard XML technologies, Computer Networks, vol.39, issue.5, pp.635-644, 2002.
DOI : 10.1145/371920.372183

D. Newman, P. Smyth, M. Welling, and A. U. Asuncion, Distributed inference for latent dirichlet allocation, Advances in neural information processing systems, pp.1081-1088, 2007.

M. E. Newman, Complex systems: A survey, Am. J. Phys, vol.79, pp.800-810, 2011.

M. E. Newman, The structure and function of complex networks, Siam Review, vol.45, pp.167-256, 2003.

M. E. Newman, Fast algorithm for detecting community structure in networks, Physical Review E, vol.69, issue.6, 2004.

H. T. Nguyen, An Introduction to Random Sets, 2006.

M. Niazi and A. Hussain, Agent-based computing from multi-agent systems to agent-based models: a visual survey, Scientometrics, vol.89, issue.2, p.479, 2011.

N. Nicoloyannis, Structures prétopologiques et classification automatique. Le logiciel Demon, 1988.

E. Otte and R. Rousseau, Social network analysis: a powerful strategy, also for the information sciences, Journal of Information Science, vol.28, issue.6, pp.441-453, 2002.

A. S. Patil and B. Pawar, Automated classification of web sites using Naive Bayesian algorithm, Proceedings of the International MultiConference of Engineers and Computer Scientists, vol.1, 2012.

C. Petermann, S. B. Amor, and A. Bui, A pretopological multi-agent based model for an efficient and reliable Smart Grid simulation, Proceedings on the International Conference on Artificial Intelligence (ICAI), p.1, 2012.

X. Qi and B. D. Davison, Web page classification: Features and algorithms, ACM Computing Surveys, vol.41, issue.2, pp.1-31, 2009.

Z. Qiu, B. Wu, B. Wang, and L. Yu, Gibbs Collapsed Sampling for Latent Dirichlet Allocation on Spark, Journal of Machine Learning Research, pp.17-28, 2014.

M. Rafi and M. S. Shaikh, An improved semantic similarity measure for document clustering based on topic maps, 2013.

D. Ramage, D. Hall, R. Nallapati, and C. D. Manning, Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol.1, pp.248-256, 2009.

L. Rokach, Ensemble-based classifiers, Artificial Intelligence Review, vol.33, issue.1-2, pp.1-39, 2009.

M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth, The author-topic model for authors and documents, Proceedings of the 20th conference on Uncertainty in artificial intelligence, pp.487-494, 2004.

M. Rosen-Zvi, T. L. Griffiths, M. Steyvers, and P. Smyth, The Author-Topic Model for Authors and Documents, 2012.

G. Salton, Automatic Text Processing: The Transformation Analysis and Retrieval of Information by Computer, 1988.

G. Salton and C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing & Management, vol.24, pp.513-523, 1988.

S. Sampson, A novitiate in a period of change: an experimental and case study of social relationships, 1968.

K. Sayadi, Q. V. Bui, and M. Bui, Multilayer classification of web pages using random forest and semi-supervised latent dirichlet allocation, 15th International Conference on Innovations for Community Services, I4CS 2015, pp.1-7, 2015.

K. Sayadi, Q. V. Bui, and M. Bui, Distributed implementation of the latent Dirichlet allocation on Spark, Proceedings of the Seventh Symposium on Information and Communication Technology, pp.92-98, 2016.

F. Sebastiani, Machine Learning in Automated Text Categorization, ACM Comput. Surv, vol.34, issue.1, pp.1-47, 2002.

N. Selmaoui, C. Leschi, and H. Emptoz, Crest Lines Detection in Grey Level Images: Studies of Different Approaches and Proposition of a New One, Computer Analysis of Images and Patterns, 5th International Conference, CAIP'93, vol.719, pp.157-164, 1993.

N. Selmaoui, C. Leschi, and H. Emptoz, A new approach to crest lines detection in grey level images, Acta Stereologica, 1994.

P. Shakarian, A. Bhatnagar, A. Aleali, E. Shaabani, and R. Guo, Diffusion in Social Networks, 2015.

C. E. Shannon, A mathematical theory of communication, Bell System Tech. J, vol.27, pp.623-656, 1948.

B. M. Stadler and P. F. Stadler, Basic properties of closure spaces, J. Chem. Inf. Comput. Sci, vol.42, pp.577-585, 2002.

M. Steyvers and T. Griffiths, Probabilistic topic models, Handbook of Latent Semantic Analysis, vol.427, pp.424-440, 2007.

T. Sørensen, A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons, Biol. Skr, vol.5, pp.1-34, 1948.

I. J. Taneja, New Developments in Generalized Information Measures, Advances in Imaging and Electron Physics, vol.91, pp.37-135, 1995.

G. J. Torres, R. B. Basnet, A. H. Sung, S. Mukkamala, and B. M. Ribeiro, A similarity measure for clustering and its applications, Int J Electr Comput Syst Eng, vol.3, issue.3, pp.164-170, 2009.

N. X. Vinh, J. Epps, and J. Bailey, Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance, J. Mach. Learn. Res, vol.11, pp.2837-2854, 2010.

H. M. Wallach, Topic modeling: beyond bag-of-words, Proceedings of the 23rd international conference on Machine learning, pp.977-984, 2006.

H. M. Wallach, Structured topic models for language, 2008.

H. M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno, Evaluation methods for topic models, Proceedings of the 26th Annual International Conference on Machine Learning, pp.1105-1112, 2009.

Y. Wang, H. Bai, M. Stanton, W. Chen, and E. Y. Chang, Plda: Parallel latent dirichlet allocation for large-scale applications, Algorithmic Aspects in Information and Management, pp.301-314, 2009.

P. Xie and E. P. Xing, Integrating Document Clustering and Topic Modeling, 2013.

B. Xu, X. Guo, Y. Ye, and J. Cheng, An Improved Random Forest Classifier for Text Categorization, Journal of Computers, vol.7, issue.12, 2012.

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: cluster computing with working sets, Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, pp.10-10, 2010.

H. Zeng, Q. He, Z. Chen, W. Ma, and J. Ma, Learning to Cluster Web Search Results, Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '04, pp.210-217, 2004.

H. Zhuang, Y. Sun, J. Tang, J. Zhang, and X. Sun, Influence Maximization in Dynamic Social Networks, 2013 IEEE 13th International Conference on Data Mining, pp.1313-1318, 2013.
