Contribution au résumé automatique multi-documents (Contribution to automatic multi-document summarization)

Abstract : Professionals who must review documents in limited time, and private individuals who want to stay informed about a topic without reading every text on it, both need summaries. The growth in available electronic documents has made automatic summarization an important research area in natural language processing. We propose a method that classifies sentences into semantic clusters using a similarity measure computed between sentences. This step identifies sentences that convey the same information and removes redundancy from the automatically generated summaries. The method was evaluated on the "opinion summarization" task of the TAC 2008 and TAC 2009 campaigns, where our system ranked in the top quarter of participating systems. We also propose integrating the structure of newswire articles into our summarization system to improve the quality of the summaries it generates. Finally, our summarization method has been integrated into a larger application that helps the user visualize the main topics of a corpus and automatically extract the essential information.
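
The redundancy-removal idea in the abstract (group sentences that say the same thing, then keep one per group) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the similarity measure, the clustering algorithm, or any threshold, so this sketch substitutes bag-of-words cosine similarity and a greedy single-pass clustering, and the names `bag_of_words`, `cluster_sentences`, `summarize` and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts; an assumed stand-in for richer sentence features."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering (illustrative, not the thesis algorithm).

    Each sentence joins the first cluster whose seed sentence is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns lists of sentence indices; the first index of each list is the seed.
    """
    vectors = [bag_of_words(s) for s in sentences]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters


def summarize(sentences, threshold=0.5):
    """Keep one sentence (the seed) per cluster, in original order."""
    return [sentences[c[0]] for c in cluster_sentences(sentences, threshold)]
```

With three input sentences of which two are near-paraphrases, `summarize` would return two sentences: the first of the paraphrase pair and the unrelated one. A real system would likely need a stronger similarity measure and a principled way to choose the representative sentence of each cluster.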

Cited literature : 125 references

https://tel.archives-ouvertes.fr/tel-00573567
Contributor : Aurélien Bossard
Submitted on : Friday, March 4, 2011 - 10:59:56 AM
Last modification on : Wednesday, February 6, 2019 - 1:24:10 AM
Long-term archiving on : Sunday, June 5, 2011 - 2:42:13 AM

Identifiers

  • HAL Id : tel-00573567, version 1

Citation

Aurélien Bossard. Contribution au résumé automatique multi-documents. Other [cs.OH]. Université Paris-Nord - Paris XIII, 2010. French. ⟨tel-00573567⟩

Metrics

Record views : 499
File downloads : 2770