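Antelope, une plate-forme de TAL permettant d'extraire les sens du texte : théorie et applications de l'interface syntaxe-sémantique [Antelope, an NLP platform for extracting meaning from text: theory and applications of the syntax-semantics interface]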

Abstract: Quickly designing a semantic parser dedicated to a particular task is not easy. Analysis components and linguistic resources are often defined in mutually incompatible formats, which makes assembling them complex. We aim to provide an operational response to this problem with the Antelope linguistic platform, whose design and implementation principles are described in this thesis. Inspired by the Meaning-Text Theory (MTT), Antelope targets robust syntactic and semantic parsing of texts and can handle large corpora; its goal is to enable deep understanding of various kinds of text: consumer reviews, encyclopedia articles, HR documents, and newspaper articles. To achieve this goal, Antelope integrates (i) several ready-to-use components that address the most common NLP tasks and interact within a unified text analysis model; and (ii) a broad-coverage multilingual semantic lexicon compiled from various sources. Integrating all these components yields a robust and homogeneous platform with a syntax-semantics interface. The thesis presents the platform and compares it with other state-of-the-art projects; it highlights the best practices that should be followed to keep such complex software maintainable; it also introduces a semi-supervised approach to large-scale knowledge acquisition.

https://tel.archives-ouvertes.fr/tel-00803531
Contributor: François-Régis Chaumartin
Submitted on: Friday, March 22, 2013 - 10:58:43 AM
Last modification on: Friday, January 4, 2019 - 5:33:24 PM
Long-term archiving on: Sunday, June 23, 2013 - 4:00:38 AM

Identifiers

  • HAL Id: tel-00803531, version 1

Citation

François-Régis Chaumartin. Antelope, une plate-forme de TAL permettant d'extraire les sens du texte : théorie et applications de l'interface syntaxe-sémantique. Computation and Language [cs.CL]. Université Paris-Diderot - Paris VII, 2012. In French. ⟨NNT : PARVII 9545914/2012201101111⟩. ⟨tel-00803531⟩
