Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes [Visual graph modeling and retrieval: a language model approach for scene recognition]

Trong-Ton Pham
Abstract : Content-based image indexing and retrieval (CBIR) systems need to consider several types of visual features, together with the spatial information among them (i.e., different points of view), to represent images well. This thesis presents a novel approach that extends the language modeling approach from information retrieval to the problem of graph-based image retrieval. A versatile graph model of this kind is needed to represent the multiple points of view of an image. The graph-based framework is composed of three main stages:

  • Image processing stage: extracts image regions from the image and computes the numerical feature vectors associated with those regions.
  • Graph modeling stage: consists of two main steps. First, extracted image regions that are visually similar are grouped into clusters using an unsupervised learning algorithm, and each cluster is associated with a visual concept. Second, the spatial relations between the visual concepts are generated. Each image is then represented by a visual graph built from a set of visual concepts and a set of spatial relations among them.
  • Graph retrieval stage: retrieves the images relevant to a new image query. Query graphs are generated in the same way as in the graph modeling stage. Inspired by the language model for text retrieval, we extend this framework to match the query graph against the document graphs from the database. Images are then ranked by the relevance values of their corresponding image graphs.

Two instances of the visual graph model have been applied to the problems of scene recognition and robot localization. We performed the experiments on two image collections: one containing 3,849 tourist images and another composed of 3,633 images captured by a mobile robot. The results show that the visual graph model outperforms the standard language model and the Support Vector Machine method by more than 10% in accuracy.
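The graph retrieval stage described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's exact formulation: it assumes that regions have already been mapped to visual concept labels, that spatial relations are given as (concept, relation, concept) triples, and that concepts and relations are scored independently with Jelinek-Mercer smoothed unigram models. The names `make_graph`, `smoothed_log_likelihood`, and `rank`, the smoothing weight `lam`, the probability `floor`, and the toy images are all hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def make_graph(concepts, relations):
    """Represent an image as counts of visual concepts and of spatial relations.

    Assumption: concept labels and relation triples are supplied directly;
    region extraction and clustering are outside this sketch.
    """
    return {"concepts": Counter(concepts), "relations": Counter(relations)}


def smoothed_log_likelihood(query, doc, collection, lam=0.5, floor=1e-9):
    """Log-probability of the query counts under a document model.

    The document model is interpolated with the collection model
    (Jelinek-Mercer smoothing). `floor` avoids log(0) for items that
    appear nowhere in the collection; it is an illustrative choice.
    """
    doc_total = sum(doc.values())
    coll_total = sum(collection.values())
    score = 0.0
    for item, count in query.items():
        p_doc = doc[item] / doc_total if doc_total else 0.0
        p_coll = collection[item] / coll_total if coll_total else 0.0
        score += count * math.log(max((1 - lam) * p_doc + lam * p_coll, floor))
    return score


def rank(query, docs, lam=0.5):
    """Return document names ordered from most to least relevant."""
    coll_concepts, coll_relations = Counter(), Counter()
    for graph in docs.values():
        coll_concepts.update(graph["concepts"])
        coll_relations.update(graph["relations"])
    scores = {
        name: smoothed_log_likelihood(
            query["concepts"], graph["concepts"], coll_concepts, lam
        )
        + smoothed_log_likelihood(
            query["relations"], graph["relations"], coll_relations, lam
        )
        for name, graph in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy database of two image graphs (hypothetical data).
docs = {
    "beach": make_graph(
        ["sky", "sea", "sand"],
        [("sky", "above", "sea"), ("sea", "above", "sand")],
    ),
    "street": make_graph(
        ["sky", "building", "road"],
        [("sky", "above", "building"), ("building", "above", "road")],
    ),
}
query = make_graph(["sky", "sea"], [("sky", "above", "sea")])
ranking = rank(query, docs)
```

In this toy setting the query shares both a concept and a relation with the "beach" graph, so that graph receives the higher score. Smoothing against the collection keeps the "street" graph's score finite even though it contains no "sea" concept.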
Document type : Theses
Cited literature [82 references]

https://tel.archives-ouvertes.fr/tel-00996067
Contributor : Philippe Mulhem
Submitted on : Monday, May 26, 2014 - 11:21:46 AM
Last modification on : Wednesday, July 6, 2022 - 4:23:17 AM
Long-term archiving on : Tuesday, August 26, 2014 - 10:47:03 AM

Identifiers

  • HAL Id : tel-00996067, version 1

Citation

Trong-Ton Pham. Modélisation et recherche de graphes visuels : une approche par modèles de langue pour la reconnaissance de scènes. Information Retrieval [cs.IR]. Université de Grenoble, 2010. English. ⟨tel-00996067⟩

Metrics

Record views : 336
File downloads : 191