Extraction d'information spatiale à partir de données textuelles non-standards (Extraction of spatial information from non-standard textual data)

Abstract : The extraction of spatial information from textual data has become an important research topic in the field of Natural Language Processing (NLP). It meets a crucial need in the information society, in particular to improve the efficiency of Information Retrieval (IR) systems for different applications (tourism, spatial planning, opinion analysis, etc.). Such systems require a detailed analysis of the spatial information contained in the available textual data (web pages, e-mails, tweets, SMS, etc.). However, the volume and variety of these data, as well as the regular emergence of new forms of writing, make the automatic extraction of information from such corpora difficult. To meet these challenges, we propose in this thesis new text mining approaches that allow the automatic identification of variants of spatial entities and relations in textual data from mediated communication. These approaches are based on three main contributions that provide intelligent navigation methods. The first contribution addresses the recognition and identification of spatial entities in corpora of short messages (SMS, tweets), which are characterized by weakly standardized modes of writing. The second contribution is dedicated to the identification of new forms or variants of spatial relations in these specific corpora. Finally, the third contribution concerns the identification of the semantic relations associated with the textual spatial information.
Document type :
Theses

Cited literature : 170 references

https://tel.archives-ouvertes.fr/tel-02138938
Contributor : Abes Star
Submitted on : Friday, May 24, 2019 - 11:44:28 AM
Last modification on : Monday, September 2, 2019 - 2:46:15 PM

File

ZENASNI_2018_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02138938, version 1

Citation

Sarah Zenasni. Extraction d'information spatiale à partir de données textuelles non-standards. Other [cs.OH]. Université Montpellier, 2018. French. ⟨NNT : 2018MONTS076⟩. ⟨tel-02138938⟩

Metrics

Record views : 63

File downloads : 24