Towards an interactive index structuring system for content-based image retrieval in large image databases

Abstract : This thesis addresses Content-Based Image Retrieval (CBIR) in large image databases. Traditional CBIR systems generally rely on three phases: feature extraction, feature space structuring, and retrieval. We focus on the structuring phase, which organizes the visual feature descriptors of all images, as produced by the feature extraction phase, into an efficient data structure in order to facilitate, accelerate, and improve subsequent retrieval. Instead of traditional structuring methods, we study clustering methods, which organize image descriptors into groups of similar objects (clusters) without any constraint on cluster size. To reduce the “semantic gap” between the high-level semantic concepts expressed by the user and the low-level features automatically extracted from the images, we propose to involve the user in the clustering phase, so that he/she can interact with the system to improve the clustering results and thus the results of subsequent retrieval. To this end, we propose a new interactive semi-supervised clustering model based on pairwise constraints (must-link and cannot-link) between groups of images. First, images are organized into clusters using the unsupervised clustering method BIRCH (Zhang et al., 1996). The user is then brought into the interaction loop to guide the clustering process. In each interactive iteration, the user visualizes the clustering results and provides feedback to the system via our interactive interface. With a few simple clicks, the user can specify the positive and/or negative images for each cluster. The user can also drag and drop images between clusters to change the cluster assignment of some images.
Pairwise constraints are then deduced from the user feedback and from neighbourhood information. Taking these constraints into account, the system reorganizes the data set using the semi-supervised clustering method proposed in this thesis. The interaction loop can be repeated until the clustering result satisfies the user. Several strategies for deducing pairwise constraints are proposed and analyzed both theoretically and experimentally. To keep the clustering results from depending on the subjectivity of a human user, our experiments use a software agent that simulates the user's behaviour when providing feedback to the system. In a comparison with HMRF-kmeans (Basu et al., 2004), the most popular semi-supervised clustering method, our method gives better results.
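
As a rough illustration of the feedback-to-constraints step described in the abstract, the following Python sketch shows one simple way a simulated user agent could mark positive and negative images per cluster, and how must-link and cannot-link pairs could then be derived. It is not the strategy proposed in the thesis: the function names `simulate_feedback` and `deduce_constraints`, the majority-label rule for the agent, and the image-level (rather than group-level) constraints are all assumptions made for this example, and neighbourhood information is ignored.

```python
from collections import Counter
from itertools import combinations


def simulate_feedback(clusters, true_labels):
    """Hypothetical software agent standing in for the human user.

    For each cluster, images whose ground-truth label equals the cluster's
    majority label are marked positive; all other images are marked negative.
    (Assumed rule for illustration only.)
    """
    feedback = {}
    for cluster_id, members in clusters.items():
        counts = Counter(true_labels[i] for i in members)
        majority_label, _ = counts.most_common(1)[0]
        positives = [i for i in members if true_labels[i] == majority_label]
        negatives = [i for i in members if true_labels[i] != majority_label]
        feedback[cluster_id] = (positives, negatives)
    return feedback


def deduce_constraints(feedback):
    """Derive pairwise constraints from per-cluster feedback.

    Assumed rule: two positive images of the same cluster must be linked;
    a positive and a negative image of the same cluster cannot be linked.
    Pairs are stored as (smaller_id, larger_id) tuples.
    """
    must_link, cannot_link = set(), set()
    for positives, negatives in feedback.values():
        for a, b in combinations(sorted(positives), 2):
            must_link.add((a, b))
        for p in positives:
            for n in negatives:
                cannot_link.add((min(p, n), max(p, n)))
    return must_link, cannot_link


# Toy data: image ids grouped into two clusters, with ground-truth labels.
example_clusters = {0: [0, 1, 2, 3], 1: [4, 5, 6]}
example_labels = {0: "cat", 1: "cat", 2: "cat", 3: "dog",
                  4: "dog", 5: "dog", 6: "cat"}

example_feedback = simulate_feedback(example_clusters, example_labels)
example_must, example_cannot = deduce_constraints(example_feedback)
print(sorted(example_must))
print(sorted(example_cannot))
```

In a full interaction loop, constraints such as these would be passed to a semi-supervised clustering step, and the cycle would repeat until the user (or the agent) is satisfied with the result.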
Document type : Theses

Cited literature [12 references]

https://tel.archives-ouvertes.fr/tel-00934842
Contributor : Hien Phuong Lai
Submitted on : Wednesday, January 22, 2014 - 4:17:35 PM
Last modification on : Thursday, May 17, 2018 - 3:50:42 AM
Long-term archiving on : Thursday, April 24, 2014 - 11:25:11 AM

Identifiers

  • HAL Id : tel-00934842, version 1

Citation

Hien Phuong Lai. Towards an interactive index structuring system for content-based image retrieval in large image databases. Computers and Society [cs.CY]. Université de La Rochelle, 2013. English. ⟨NNT : 2013LAROS409⟩. ⟨tel-00934842⟩
