Segmentation en lignes de documents anciens : application aux documents arabes [Line segmentation of ancient documents: application to Arabic documents]

Nazih Ouwayed 1
1 READ - READ
LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Indexing scanned handwritten documents depends on line segmentation: if it fails, the subsequent steps of word extraction and recognition fail as well. In addition, ancient Arabic documents contain annotations in the margins, often written in obliquely oriented lines. Detecting these lines is as important as detecting the rest of the text and is a major challenge for indexing these documents. The segmentation described in this thesis therefore involves the extraction of multi-oriented lines. For this problem, the literature offers only rudimentary techniques, based essentially on projecting the document image along a single direction, which fail on multi-oriented documents. To address this gap, we propose an adaptive approach that first locates the different orientation zones and then relies on each local orientation to extract the lines. During this thesis, I focused in particular on the following points:
  • Automatic paving of the image using the active contour model (snake).
  • Preparing the projection-profile signal by removing all pixels that are not needed for orientation estimation, then implementing all the energy distributions of Cohen's class on the projection profile to find the distribution that best gives the orientation.
  • Applying extension rules to find the oriented zones.
  • Extracting lines with a connected-component follow-up algorithm.
  • Separating overlapping and touching lines using the morphology of Arabic terminal letters.
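The idea of estimating an orientation from projection profiles can be illustrated with a minimal sketch. It assumes a binarized zone given as a collection of (x, y) ink-pixel coordinates, and it scores each candidate angle by the sum of squared bin counts of the projection profile. This simple energy score is a stand-in chosen for illustration; it is not the Cohen's class analysis used in the thesis, and all function names are hypothetical.

```python
import math
from collections import Counter


def projection_profile(ink_pixels, angle_deg):
    """Histogram of ink pixels projected onto the axis perpendicular
    to a candidate text direction of `angle_deg` degrees."""
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    profile = Counter()
    for x, y in ink_pixels:
        # Signed distance from the pixel to a line through the origin
        # with direction (cos theta, sin theta), rounded to a unit bin.
        profile[round(y * c - x * s)] += 1
    return profile


def profile_energy(profile):
    """Illustrative score (an assumption, not the thesis criterion):
    sum of squared bin counts, large when ink falls into few bins."""
    return sum(n * n for n in profile.values())


def estimate_orientation(ink_pixels, candidate_angles):
    """Return the candidate angle whose profile has the highest energy."""
    return max(
        candidate_angles,
        key=lambda a: profile_energy(projection_profile(ink_pixels, a)),
    )


def synthetic_lines(angle_deg, n_lines=5, length=200, spacing=30):
    """Hypothetical test data: parallel one-pixel-wide lines at `angle_deg`."""
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    pixels = set()
    for k in range(n_lines):
        d = k * spacing  # offset of line k along the normal (-sin, cos)
        for t in range(length):
            pixels.add((round(t * c - d * s), round(t * s + d * c)))
    return pixels


if __name__ == "__main__":
    zone = synthetic_lines(20)
    print(estimate_orientation(zone, range(-45, 46)))
```

Run on the whole page, a single estimate of this kind cannot represent margin annotations written at a different angle from the main text, which is the limitation the abstract points out. Run separately on each zone, it gives one local orientation per zone.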
Document type : Theses
Cited literature [63 references]

https://tel.archives-ouvertes.fr/tel-00495972
Contributor : Nazih Ouwayed
Submitted on : Tuesday, June 29, 2010 - 12:20:17 PM
Last modification on : Tuesday, April 24, 2018 - 1:54:46 PM
Long-term archiving on : Thursday, September 30, 2010 - 5:46:20 PM

Identifiers

  • HAL Id : tel-00495972, version 1

Citation

Nazih Ouwayed. Segmentation en lignes de documents anciens : application aux documents arabes. Interface homme-machine [cs.HC]. Université Nancy II, 2010. Français. ⟨tel-00495972⟩
