Visual search and recognition of objects, scenes and people

Josef Sivic 1
1 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract : The objective of this work is to take a step towards an artificial system with human-like visual intelligence capabilities. We consider the following three visual recognition problems. First, we show how to identify the same object or scene instance in a large database of images despite significant changes in appearance due to viewpoint and illumination, but also aging, seasonal changes, or depiction style. Second, we consider recognition of object classes such as "chairs" or "windows" (as opposed to a specific instance of a chair or a window). We investigate how to name the object classes present in an image, identify their locations, and predict their approximate 3D model and fine-grained style ("Is this a bar stool or a folding chair?"; "Is this a bay window or a French window?"). In particular, we investigate different levels of supervision for this task, ranging from observing images without any supervision to having millions of labelled images or a set of full 3D models. Finally, we consider recognition of people and their actions in unconstrained videos such as TV or feature-length films. Specifically, we investigate how to identify individual people in the video using their faces ("Who is this?") as well as recognize what they are doing ("Is this person walking or sitting?").
Document type : Habilitation à diriger des recherches

Cited literature : 203 references

https://tel.archives-ouvertes.fr/tel-01064559
Contributor : Minsu Cho
Submitted on : Tuesday, September 16, 2014 - 3:52:38 PM
Last modification on : Monday, January 28, 2019 - 9:04:25 AM
Long-term archiving on : Wednesday, December 17, 2014 - 11:30:36 AM

Identifiers

  • HAL Id : tel-01064559, version 1

Citation

Josef Sivic. Visual search and recognition of objects, scenes and people. Computer Vision and Pattern Recognition [cs.CV]. Ecole Normale Supérieure de Paris - ENS Paris, 2014. ⟨tel-01064559⟩
