Detecting and indexing moving objects for behavior analysis by video and audio interpretation

Alessia Saggese 1, 2
1 Equipe Image - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract : In recent decades we have witnessed a growing need for security in many public environments, which has led to the proliferation of cameras and microphones. However, the main limitation of traditional audio-video surveillance systems lies in the so-called psychological overcharge of the human operators responsible for security, which reduces their ability to analyse raw data flows from multiple sources of multimedia information. For these reasons, in this thesis we propose an intelligent surveillance system able to provide images and video with a semantic interpretation, in an attempt to bridge the gap between their low-level representation in terms of pixels and the high-level, natural language description that a human would give of them. In particular, the proposed framework starts by analysing the videos and extracting the trajectories of the objects populating the scene (tracking module). It is important to underline that the trajectory is a highly discriminant feature, since the movement of objects in a scene is not random but has an underlying structure that can be exploited to build models. Once extracted, this large number of trajectories needs to be indexed and properly stored in order to improve the overall performance of the system during the retrieval step (storing and retrieval module). Furthermore, the human operator is informed as soon as an abnormal behaviour occurs (visual behaviour understanding module). Where the information extracted from the videos is insufficient or not sufficiently reliable, the proposed system is enriched by a module in charge of recognizing audio events, such as gunshots, screams or breaking glass (audio recognition module).
It is worth pointing out that the integration of audio-based and video-based information is a significant addition to the proposed framework, being a completely novel aspect in the field of video and audio analysis. Each proposed module has been tested both on standard datasets and in real environments; the promising results obtained confirm the advance with respect to the state of the art, as well as the applicability of the proposed method in real scenarios.

Cited literature [136 references]

https://hal.archives-ouvertes.fr/tel-01082696
Contributor : Greyc Référent
Submitted on : Friday, November 14, 2014 - 10:18:28 AM
Last modification on : Tuesday, February 5, 2019 - 12:12:43 PM
Archived on : Friday, April 14, 2017 - 2:01:47 PM

Identifiers

  • HAL Id : tel-01082696, version 1

Citation

Alessia Saggese. Detecting and indexing moving objects for behavior analysis by video and audio interpretation. Computer Science [cs]. Université de Caen, 2014. English. ⟨tel-01082696⟩
