Habilitation à diriger des recherches

Visual search and recognition of objects, scenes and people

Josef Sivic 1
1 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract : The objective of this work is to take a step towards an artificial system with human-like visual intelligence capabilities. We consider the following three visual recognition problems. First, we show how to identify the same object or scene instance in a large database of images despite significant changes in appearance due to viewpoint and illumination, but also aging, seasonal changes, or depiction style. Second, we consider recognition of object classes such as "chairs" or "windows" (as opposed to a specific instance of a chair or a window). We investigate how to name the object classes present in an image, identify their locations, and predict their approximate 3D model and fine-grained style ("Is this a bar stool or a folding chair?"; "Is this a bay window or a French window?"). In particular, we investigate different levels of supervision for this task, ranging from just observing images without any supervision to having millions of labelled images or a set of full 3D models. Finally, we consider recognition of people and their actions in unconstrained videos such as TV or feature-length films. Specifically, we investigate how to identify individual people in the video using their faces ("Who is this?") as well as recognize what they are doing ("Is this person walking or sitting?").

Cited literature [203 references]
Contributor : Minsu Cho
Submitted on : Tuesday, September 16, 2014 - 3:52:38 PM
Last modification on : Monday, January 28, 2019 - 9:04:25 AM
Archived on : Wednesday, December 17, 2014 - 11:30:36 AM


  • HAL Id : tel-01064559, version 1



Josef Sivic. Visual search and recognition of objects, scenes and people. Computer Vision and Pattern Recognition [cs.CV]. Ecole Normale Supérieure de Paris - ENS Paris, 2014. ⟨tel-01064559⟩


