Approches catégoriques et non catégoriques en linguistique des corpus spécialisés, application à un système de filtrage d'information

Abstract : This thesis belongs to the field of corpus linguistics, which centers on actual utterances, applied to specialised domains. Building on the theoretical and methodological foundations of data-oriented approaches in linguistics, it aims to identify and describe complex lexical units that are strongly correlated with well-defined sub-topics: topical signatures. One result of this work is the description of a set of topical signatures for one sub-topic of financial news extracts: corporate financial activities. The approach, which builds on classical distributional methods, also aims to evaluate non-categorical and non-logic-centered approaches, such as stochastic ones, for identifying topical signatures. The extracted signatures serve as lexical resources for a selective information dissemination system, CORAIL. This system is the outcome of an industrial research project funded by the French Ministry for Research and Industry.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00002847
Contributor : Antonio Balvet
Submitted on : Tuesday, May 20, 2003 - 12:16:09 PM
Last modification on : Tuesday, March 2, 2021 - 9:57:08 AM
Long-term archiving on : Friday, April 2, 2010 - 10:55:49 PM

Identifiers

  • HAL Id : tel-00002847, version 1

Citation

Antonio Balvet. Approches catégoriques et non catégoriques en linguistique des corpus spécialisés, application à un système de filtrage d'information. Sciences de l'Homme et Société. Université de Nanterre - Paris X, 2002. Français. ⟨tel-00002847⟩
