Algorithmes d'analyse syntaxique par grammaires lexicalisées : optimisation et traitement de l'ambiguïté

Abstract : This work addresses the automatic parsing of written texts using lexicalized grammars and large-coverage language resources. More specifically, we concentrated on three areas: algorithms, the easy development of NLP applications useful in an industrial context, and deep syntactic parsing. Concerning the first point, we implemented new algorithms for optimizing local grammars before they are used for parsing, and we propose an efficient algorithm for applying this kind of grammar to text. Our algorithm improves the processing of lexical and syntactic ambiguities, and we show that it scales well when processing large text corpora in combination with fine-grained, large-coverage language resources. Concerning the second point, we were actively committed to the development of the Outilex project, a general-purpose linguistic platform dedicated to text processing. Through its modular architecture, the platform aims to ease the development of high-level hybrid NLP applications that mix symbolic and stochastic approaches. Finally, the third part of our research involves exploiting the lexicon-grammar tables for deep syntactic parsing and for identifying predicate-argument structures in French texts. For this purpose, we extended the formalism of local grammars with feature-structure constraints. These constraints make it possible to handle many syntactic phenomena declaratively in our grammar and to formalize the result of syntactic parsing. We present our grammar for French in its current state, which is semi-automatically generated from the lexicon-grammar tables, and we report an evaluation of its lexical and syntactic coverage.
Document type :
Theses

Cited literature [83 references]

https://tel.archives-ouvertes.fr/tel-00626253
Contributor : Guillaume Blin
Submitted on : Saturday, September 24, 2011 - 12:25:18 PM
Last modification on : Wednesday, April 11, 2018 - 12:12:02 PM
Long-term archiving on : Sunday, December 25, 2011 - 2:20:37 AM

Identifiers

  • HAL Id : tel-00626253, version 1

Citation

Olivier Blanc. Algorithmes d'analyse syntaxique par grammaires lexicalisées : optimisation et traitement de l'ambiguïté. Autre [cs.OH]. Université Paris-Est, 2006. Français. ⟨tel-00626253⟩

Metrics

Record views : 942
File downloads : 6221