
Méthodologie linguistique et terminologique pour la structuration d'ontologies différentielles à partir de corpus textuels

Abstract : Resources such as terminologies and ontologies are used in a number of applications, including documentary description and information retrieval. Different methodologies have been proposed to build such resources, based either on interviews with experts or on textual corpora.
This thesis focuses on using existing Natural Language Processing methodologies, designed to support the building of ontologies from textual corpora, to build a particular type of resource: differential ontologies. These ontologies are structured according to a system of semantic identities and differences between their constituents: terms of the domain and categorisation items called “top level categories”.
We present several experiments that we carried out to elicit, structure, define and “interdefine” the terminological items relevant to a given task. Our first use case was the OPALES project, in which we had to provide a group of anthropologists with the conceptual vocabulary that they needed to annotate audiovisual documents about childhood. We used the textual corpus built for this project to test linguistic tools and methodologies for building ontologies from textual data, and we developed our own programs. The resulting suite of programs, called SODA, focuses on extracting and using defining contexts in corpora to spot terminological items, to structure them, and to provide semantic similarity information that makes it possible to compare them.
Document type :
Theses

Cited literature [94 references]

https://tel.archives-ouvertes.fr/tel-00162575
Contributor : Claire Carpentier
Submitted on : Friday, July 13, 2007 - 5:30:06 PM
Last modification on : Friday, October 23, 2020 - 4:37:16 PM
Long-term archiving on : Thursday, April 8, 2010 - 11:13:47 PM

Identifiers

  • HAL Id : tel-00162575, version 1

Citation

Véronique Malaisé. Méthodologie linguistique et terminologique pour la structuration d'ontologies différentielles à partir de corpus textuels. Linguistics. Université Paris-Diderot - Paris VII, 2005. French. ⟨tel-00162575⟩
