Segmentation et classification dans les images de documents numérisés

Abstract : This thesis addresses the processing and analysis of printed document images, with the goal of automating press reviews. The images output by the scanner are processed without any prior knowledge or human intervention. To characterize them, we present a scalable analysis system for complex documents. This characterization is based on a hybrid color segmentation suited to noisy document images. The color analysis adapts the text extraction algorithms to the local image properties. The resulting color and text information is used to segment the layout of press images and to compute features on the resulting blocks. These blocks are then classified to detect advertisements. The second part of the thesis deals with a more general problem: clustering and classification. We present a new clustering approach, named ACPP, which is fully automated, fast and easy to use. Its main feature is that it requires neither prior knowledge about the data nor theoretical parameters to be set by the user. Color analysis, layout segmentation and the ACPP classification method are combined to create a complete processing chain for press images.
Document type :
Theses

Cited literature [119 references]

https://tel.archives-ouvertes.fr/tel-00749933
Contributor : Abes Star
Submitted on : Thursday, November 8, 2012 - 3:57:15 PM
Last modification on : Wednesday, July 8, 2020 - 12:42:08 PM
Long-term archiving on : Saturday, February 9, 2013 - 3:50:34 AM

File

these.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00749933, version 1

Citation

Asma Ouji. Segmentation et classification dans les images de documents numérisés. Autre [cs.OH]. INSA de Lyon, 2012. Français. ⟨NNT : 2012ISAL0044⟩. ⟨tel-00749933⟩

Metrics

Record views : 778
Files downloads : 26820