Détection de textes enfouis dans des bases d'images généralistes : un descripteur sémantique pour l'indexation (Detection of embedded text in generic image databases: a semantic descriptor for indexing)

Abstract : Multimedia databases, both personal and professional, are growing continuously, and the need for automatic solutions has become pressing. The research effort devoted to content-based image indexing is also growing, but the semantic gap remains difficult to cross: the low-level descriptors used for indexing are not effective enough to support ergonomic manipulation of large, generic image databases. Text present in a scene is usually linked to the semantic context of the image and is therefore a relevant descriptor for content-based image indexing. In this thesis we present an approach to the automatic detection of text in natural scenes that aims to handle text of different sizes, orientations, and backgrounds. The system uses a non-linear scale space based on the ultimate opening operator (a numerical morphological residue). In a first step, we study the behaviour of this operator on real images and propose solutions to overcome its intrinsic limitations. In a second step, the operator is used in a text detection framework that also includes various text categorisation tools. The robustness of our approach is demonstrated on two different datasets. First, we took part in the ImagEval evaluation campaign, where our approach was ranked first in the text localisation contest. Second, we produced results with the same framework on the free ICDAR dataset; these results are comparable with the state of the art. Lastly, a demonstrator was built for EADS. For confidentiality reasons, that work could not be included in this manuscript.
Document type :
Theses

Cited literature : 125 references

https://pastel.archives-ouvertes.fr/pastel-00003782
Contributor : Ecole Mines Paristech
Submitted on : Monday, June 2, 2008 - 8:00:00 AM
Last modification on : Monday, November 12, 2018 - 10:57:02 AM
Long-term archiving on : Friday, September 10, 2010 - 12:33:03 PM

Identifiers

  • HAL Id : pastel-00003782, version 1

Citation

Thomas Retornaz. Détection de textes enfouis dans des bases d'images généralistes : un descripteur sémantique pour l'indexation. École Nationale Supérieure des Mines de Paris, 2007. French. ⟨NNT : 2007ENMP1511⟩. ⟨pastel-00003782⟩
