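Le dictionnaire électronique des séquences nominales figées en coréen et de leurs formes fléchies - méthodes et applications [The electronic dictionary of frozen nominal sequences in Korean and their inflected forms: methods and applications]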

Abstract : This work presents methods for building electronic dictionaries of frozen nominal sequences in Korean and of their inflected forms, and justifies these methods by applying the resulting dictionary to the automatic analysis of Korean texts. To allow frozen nominal sequences to be recognized by dictionary lookup, we classified them into three categories according to typographical (spacing) conventions: compact nouns (NC), frozen nouns with optional spacing (NFF), and frozen nouns with obligatory spacing (NFO). Because frozen nominal sequences appear in Korean texts in inflected form, we built, on the one hand, an electronic dictionary of 45,000 NFF entries and, on the other hand, a transducer of nominal postposition sequences together with their segmentation. We then merged these two data sets using the inflectional codes associated with each entry and the inflection function of INTEX. The dictionary built with these methods has the following advantages over existing systems:
1) The dictionary of inflected NFF forms allows automatic recognition of all spacing variants of an NFF.
2) The dictionary of inflected NFF forms allows an inflected form of an NFF to be segmented into the NFF and a sequence of nominal postpositions.
3) The dictionary of nominal postposition sequences, represented as graphs, allows these sequences to be segmented into individual postpositions.
4) The NFF dictionary allows the segmentation of free nominal sequences that are written without spaces.
5) The NFF dictionary can be extended into a bilingual dictionary for machine translation.
6) Each entry of the NFF dictionary carries codes that are useful for automatic processing: a semantic feature code, a code indicating whether the head noun of the entry is a predicative noun, the origin of the entry, and its part of speech.
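The first three advantages can be illustrated with a minimal Python sketch. It is not the thesis's INTEX implementation: the function names, the sample entry 인공 지능 ("artificial intelligence") and the small postposition table are all assumptions chosen for illustration, and plain dictionary lookup stands in for the transducers described in the abstract.

```python
from itertools import product

# Hypothetical toy table: each postposition sequence maps to its segmentation.
# The thesis encodes such sequences as graphs; a dict is used here for brevity.
POSTPOSITION_SEQUENCES = {
    "은": ["은"],
    "는": ["는"],
    "을": ["을"],
    "에서": ["에서"],
    "에서는": ["에서", "는"],
}


def spacing_variants(components):
    """Return every surface form obtained by writing each internal
    boundary of a multi-component noun either with or without a space."""
    if not components:
        return set()
    variants = set()
    for separators in product(["", " "], repeat=len(components) - 1):
        form = components[0]
        for separator, component in zip(separators, components[1:]):
            form += separator + component
        variants.add(form)
    return variants


def build_variant_index(entries):
    """Map each spacing variant to its canonical (fully spaced) entry."""
    index = {}
    for components in entries:
        canonical = " ".join(components)
        for variant in spacing_variants(components):
            index[variant] = canonical
    return index


def segment_inflected(token, index, postposition_sequences):
    """Split an inflected form into (canonical entry, list of postpositions).

    The longest matching noun is tried first. Returns None when no split
    is found in the toy data."""
    for cut in range(len(token), 0, -1):
        stem, suffix = token[:cut], token[cut:]
        if stem not in index:
            continue
        if suffix == "":
            return index[stem], []
        if suffix in postposition_sequences:
            return index[stem], postposition_sequences[suffix]
    return None


if __name__ == "__main__":
    index = build_variant_index([["인공", "지능"]])
    print(sorted(index))
    print(segment_inflected("인공지능에서는", index, POSTPOSITION_SEQUENCES))
```

In this sketch an entry with n components yields 2^(n-1) spacing variants, which is why generating the variants automatically from codes attached to each entry is more practical than listing them by hand.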
Document type :
Theses
Cited literature : 97 references

https://tel.archives-ouvertes.fr/tel-00627610
Contributor : Lingu Ligm
Submitted on : Thursday, September 29, 2011 - 10:26:48 AM
Last modification on : Wednesday, April 11, 2018 - 12:12:02 PM
Long-term archiving on : Friday, December 30, 2011 - 2:21:53 AM

Identifiers

  • HAL Id : tel-00627610, version 1

Citation

Sun-Mee Bae. Le dictionnaire électronique des séquences nominales figées en coréen et de leurs formes fléchies - méthodes et applications. Other [cs.OH]. Université Paris-Est, 2002. French. ⟨tel-00627610⟩

Metrics

Record views : 336
File downloads : 517