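Tools and environments for incremental improvement, contributive post-editing and continuous evaluation of MT systems. Application to French-Chinese MT. (Original title: Outils et environnements pour l'amélioration incrémentale, la post-édition contributive et l'évaluation continue de systèmes de TA. Application à la TA français-chinois.)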

Abstract : This thesis, conducted under a CIFRE grant and extending one aspect of the ANR project Traouiero, first addresses the production, extension and improvement of multilingual corpora through machine translation (MT) and contributory post-editing (PE). Functional and technical improvements have been made to the SECTra and iMAG software produced in previous PhD theses (P.C. Huynh, H.T. Nguyen), and progress has been made toward a generic definition of the structure of a multilingual, annotated and multimedia corpus that may contain ordinary documents as well as pseudo-documents (such as Web pages) and meta-segments. This part has been validated by the creation of good-quality French-Chinese bilingual corpora, one of them resulting from the first application to literary translation (a Jules Verne novel).

The second part, initially motivated by an industrial need, consisted in building Moses-type MT systems, specialized to sub-languages, for French↔Chinese, and in studying how to improve them in a context of continuous use with the possibility of PE. As part of an internal project on the LIG website and of a project (TABE-FC) in cooperation with Xiamen University, the value of incremental learning in statistical MT was demonstrated, under certain conditions, through an experiment that ran throughout the thesis.

The third part of the thesis is devoted to contributing computer tools and resources and making them available. The main ones are related to the EU COST project MUMIA and result from the exploitation of the CLEF-2011 collection of 1.5 million partially multilingual patents. Large translation memories have been extracted from it (17.5 million segments), 3 MT systems have been produced (de-fr, en-fr, fr-de), and a support website for multilingual IR on patents has been built. The thesis also describes the ongoing implementation of JianDan-eval, a platform for building, deploying and evaluating MT systems.

https://tel.archives-ouvertes.fr/tel-01320566
Contributor : Abes Star
Submitted on : Tuesday, May 24, 2016 - 10:12:06 AM
Last modification on : Thursday, October 11, 2018 - 8:48:02 AM
Document(s) archived on : Thursday, August 25, 2016 - 10:24:12 AM

File

WANG_2015_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01320566, version 1

Citation

Lingxiao Wang. Outils et environnements pour l'amélioration incrémentale, la post-édition contributive et l'évaluation continue de systèmes de TA. Application à la TA français-chinois. Traitement du texte et du document. Université Grenoble Alpes, 2015. Français. ⟨NNT : 2015GREAM057⟩. ⟨tel-01320566⟩

Metrics

Record views : 443
File downloads : 374