Résumé automatique multi-document dynamique

Maali Mnasri 1
1 LVIC - Laboratoire Vision et Ingénierie des Contenus
DIASI - Département Intelligence Ambiante et Systèmes Interactifs : DRT/LIST/DIASI
Abstract : This thesis addresses automatic text summarization, and in particular update summarization. The task is to produce a differential summary of a set of new documents with respect to a set of old documents that the reader is assumed to already know. It thus adds two issues to generic automatic summarization: the temporal dimension of the information and the user's history. In this context, the work presented here takes an extractive approach based on integer linear programming (ILP) and is organized around two main axes: detecting redundancy between the selected information and the user's history, and maximizing the saliency of the selected information. For the first axis, we focused on exploiting inter-sentence similarities to detect redundancy between the information in the new documents and the information in the already known ones, by defining a method for the semantic clustering of sentences. For the second axis, we studied the impact of taking into account the discourse structure of documents, within the framework of Rhetorical Structure Theory (RST), to favor the selection of the information considered most important. The benefit of these methods was demonstrated in evaluations carried out on data from the TAC and DUC campaigns. Finally, integrating these semantic and discourse criteria through a late fusion mechanism showed that the two axes are complementary and that combining them is beneficial.
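To make the selection problem in the abstract concrete, here is a minimal sketch of update summarization as a budgeted coverage problem. It is not the thesis's model. The word-overlap (Jaccard) filter is an assumed stand-in for the semantic clustering of sentences, the exhaustive search is an assumed stand-in for an ILP solver, and the function names, the `max_sim` threshold, and the word weights are all hypothetical.

```python
from itertools import combinations


def tokens(sentence):
    """Lowercased word set; a crude stand-in for real concept extraction."""
    return {w.strip(".,;:!?").lower() for w in sentence.split()} - {""}


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0


def update_summary(new_sents, old_sents, weights, budget, max_sim=0.5):
    """Select new sentences maximizing the total weight of covered words.

    Sentences too similar to any already-known sentence are discarded first
    (redundancy with the user history); the rest compete under a word budget.
    """
    old_sets = [tokens(s) for s in old_sents]
    # Redundancy filter against the history (threshold is an assumption).
    candidates = [
        s for s in new_sents
        if all(jaccard(tokens(s), o) < max_sim for o in old_sets)
    ]
    best, best_score = [], 0.0
    # Exhaustive search replaces the ILP solver; only viable for tiny inputs.
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            if sum(len(s.split()) for s in subset) > budget:
                continue  # length constraint violated
            covered = set().union(*(tokens(s) for s in subset))
            # Each covered word counts once, however many sentences hold it.
            score = sum(weights.get(w, 0.0) for w in covered)
            if score > best_score:
                best, best_score = list(subset), score
    return best, best_score


old = ["The storm hit the coast on Monday."]
new = [
    "The storm hit the coast on Monday morning.",
    "Rescue teams evacuated two thousand residents.",
    "Officials reopened the main highway.",
]
weights = {"storm": 5, "rescue": 3, "evacuated": 3,
           "residents": 2, "highway": 2, "reopened": 1}
summary, score = update_summary(new, old, weights, budget=6)
print(summary, score)
```

In this toy run the first new sentence is dropped as redundant with the history even though it contains the highest-weighted word, which is the behavior that distinguishes update summarization from generic summarization. A real system would replace the exhaustive loop with binary decision variables in an ILP solver.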
Document type :
Theses

Cited literature [167 references]

https://tel.archives-ouvertes.fr/tel-01902781
Contributor : Abes Star
Submitted on : Tuesday, October 23, 2018 - 6:06:06 PM
Last modification on : Monday, February 10, 2020 - 6:13:47 PM
Long-term archiving on : Thursday, January 24, 2019 - 6:15:57 PM

File

75763_MNASRI_2018_archivage.pd...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01902781, version 1

Citation

Maali Mnasri. Résumé automatique multi-document dynamique. Document and Text Processing. Université Paris-Saclay, 2018. French. ⟨NNT : 2018SACLS342⟩. ⟨tel-01902781⟩

Metrics

Record views : 393
File downloads : 1178