
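Une approche linguistique de l'évaluation des ressources extraites par analyse distributionnelle automatique (A linguistic approach to the evaluation of resources extracted by automatic distributional analysis)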

Abstract : In this thesis, we address the evaluation of distributional thesauri from a linguistic point of view. The most common ways of evaluating distributional methods rely on comparison with gold standards such as WordNet or on semantic tasks such as the TOEFL test. However, these evaluation methods are quantitative and thus limit the possibility of performing a linguistic analysis of the distributional neighbours. Our work aims at a better understanding of the distributional behaviour of words in texts through the study of distributional thesauri. First, we take a quantitative approach based on a comparison of several distributional thesauri with gold standards (the DES, a dictionary of synonyms, and JeuxDeMots, a crowdsourced lexical network). This step gave us an overview of the nature of the semantic relations extracted in our distributional thesauri. In a second step, we relied on this comparison to select samples of distributional neighbours for a qualitative study. We focused on "classical" semantic relations such as synonymy, antonymy, hypernymy and meronymy. We considered several protocols to compare the properties of the pairs of distributional neighbours that were found in the gold standards with those of the pairs that were not. Thus, taking into account parameters such as the nature of the corpora from which our distributional thesauri were generated, we explain why some synonyms, hypernyms, etc. can be substituted in texts while others cannot. The purpose of this work is twofold: first, it questions the traditional evaluation methods; second, it shows how distributional thesauri can be used for the study of semantic relations.
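
The quantitative step described in the abstract (comparing distributional neighbours with a gold standard, then splitting the pairs into attested and unattested samples for qualitative study) can be illustrated with a minimal Python sketch. This is not the thesis's actual pipeline: the toy English entries, the relation labels attached to them, and the function name `compare_with_gold` are all assumptions made up for illustration, and real resources such as the DES or JeuxDeMots would need their own loaders.

```python
from collections import Counter

# Hypothetical toy thesaurus: headword -> ranked distributional neighbours.
# (Invented data, not taken from the thesis.)
thesaurus = {
    "car": ["automobile", "truck", "road", "wheel"],
    "house": ["home", "building", "garden"],
}

# Hypothetical gold standard: (headword, neighbour) -> semantic relation.
gold = {
    ("car", "automobile"): "synonymy",
    ("car", "wheel"): "meronymy",
    ("house", "home"): "synonymy",
    ("house", "building"): "hypernymy",
}


def compare_with_gold(thesaurus, gold):
    """Split neighbour pairs into those attested in the gold standard
    and those that are not, and summarise the attested relations."""
    attested, unattested = [], []
    for head, neighbours in thesaurus.items():
        for neighbour in neighbours:
            pair = (head, neighbour)
            if pair in gold:
                attested.append(pair)
            else:
                unattested.append(pair)
    total = len(attested) + len(unattested)
    # Share of extracted pairs that the gold standard confirms.
    precision = len(attested) / total if total else 0.0
    # Distribution of relation types among the confirmed pairs.
    relations = Counter(gold[pair] for pair in attested)
    return precision, relations, attested, unattested


precision, relations, attested, unattested = compare_with_gold(thesaurus, gold)
print(f"precision: {precision:.2f}")
print("relations:", dict(relations))
print("unattested pairs for qualitative study:", unattested)
```

In this sketch the precision score corresponds to the purely quantitative evaluation that the abstract calls into question, while the `attested` and `unattested` lists correspond to the samples from which pairs could be drawn for a qualitative comparison.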
Document type : Theses

Cited literature : 222 references

https://tel.archives-ouvertes.fr/tel-00937926
Contributor : Abes Star
Submitted on : Tuesday, January 28, 2014 - 11:22:07 PM
Last modification on : Wednesday, October 14, 2020 - 3:44:04 AM
Long-term archiving on : Sunday, April 9, 2017 - 1:58:21 AM

File

Morlane-Hondere_Francois.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00937926, version 1

Citation

François Morlane-Hondère. Une approche linguistique de l'évaluation des ressources extraites par analyse distributionnelle automatique. Linguistique. Université Toulouse le Mirail - Toulouse II, 2013. Français. ⟨NNT : 2013TOU20040⟩. ⟨tel-00937926⟩

Metrics

Record views : 844

File downloads : 3705