Semantic classification of documents (CBO)

Classification using conventional classifiers

Semantic similarity of documents

Bag-of-words

N-grams
