
Detection of embedded text in generic image databases: a semantic descriptor for indexing

Abstract : Multimedia databases, both personal and professional, are growing continuously, and automatic solutions are becoming necessary. The research effort devoted to content-based image indexing is also growing, but the semantic gap is difficult to cross: the low-level descriptors used for indexing are not efficient enough for ergonomic manipulation of large, generic image databases. The text present in a scene is usually linked to the semantic context of the image and constitutes a relevant descriptor for content-based image indexing. In this thesis we present an approach to the automatic detection of text in natural scenes that aims to handle text of different sizes, orientations, and backgrounds. The system uses a non-linear scale space based on the ultimate opening operator (a morphological numerical residue). In a first step, we study the behaviour of this operator on real images and propose solutions to overcome its intrinsic limitations. In a second step, the operator is used in a text detection framework that additionally contains various text categorisation tools. The robustness of our approach is demonstrated on two different datasets. First, we took part in the ImagEval evaluation campaign, where our approach was ranked first in the text localisation contest. Second, we produced results (using the same framework) on the free ICDAR dataset; these results are comparable with the state of the art. Lastly, a demonstrator was built for EADS. For confidentiality reasons, this work could not be included in this manuscript.

Cited literature : 125 references
Contributor : Ecole Mines ParisTech
Submitted on : Monday, June 2, 2008 - 8:00:00 AM
Last modification on : Wednesday, November 17, 2021 - 12:27:08 PM
Long-term archiving on : Friday, September 10, 2010 - 12:33:03 PM


  • HAL Id : pastel-00003782, version 1


Thomas Retornaz. Détection de textes enfouis dans des bases d'images généralistes : un descripteur sémantique pour l'indexation. École Nationale Supérieure des Mines de Paris, 2007. French. ⟨NNT : 2007ENMP1511⟩. ⟨pastel-00003782⟩


