Towards the integration of user post-edits to improve probabilistic machine translation systems

Abstract : Machine translation technologies are now seen as a promising way to produce low-cost translations. However, the current state of the art does not allow full automation of the process, and human intervention remains essential to produce high-quality results. To ensure translation quality, system outputs are commonly post-edited: they are manually checked and, if necessary, corrected by the user. This post-editing work can be a valuable source of data for system analysis and improvement. Our work focuses on developing an approach that takes advantage of this user feedback to improve and update a statistical machine translation (SMT) system. The experiments exploit a corpus of about 10,000 SMT translation hypotheses post-edited by volunteers through a crowdsourcing platform. The first experiments integrated the post-edits into the translation model on the one hand, and applied them to the system outputs through automatic post-editing on the other, and allowed us to evaluate the complexity of the task. Our further detailed study of statistical automatic post-editing systems evaluates the usability, benefits, and limitations of the approach. We also show that the collected post-edits can be successfully used to estimate the confidence of a given machine translation output. The results show that using post-edits of machine translation hypotheses as a source of information is a difficult but promising way to improve the quality of current probabilistic systems.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00995104
Contributor : Laurent Besacier
Submitted on : Friday, May 23, 2014 - 9:04:21 AM
Last modification on : Thursday, October 11, 2018 - 8:48:02 AM
Document(s) archived on : Saturday, August 23, 2014 - 10:47:19 AM

Identifiers

  • HAL Id : tel-00995104, version 1

Citation

Marion Potet. Vers l'intégration de post-éditions d'utilisateurs pour améliorer les systèmes de traduction automatiques probabilistes. Other [cs.OH]. Université de Grenoble, 2013. French. ⟨NNT : 2013GRENM011⟩. ⟨tel-00995104⟩
