Structuration multimodale des vidéos de sport par modèles stochastiques [Multimodal structuring of sports videos using stochastic models]

Ewa Kijak 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This thesis addresses the structure analysis of sports videos using both audio and visual cues. The proposed method relies on a statistical model that takes into account both the content of each shot and the interleaving of shots. This stochastic modeling is carried out within the general framework of Hidden Markov Models (HMMs), which can efficiently integrate prior information about video content and editing rules and merge audio and visual cues. Visual features characterize the type of view in a shot. Audio features describe the audio events within a shot. The approach is validated on tennis videos, which present a hierarchical, complex and well-defined structure. Parsing the video structure relies on analyzing the temporal interleaving of video shots. Typical tennis scenes are segmented and identified simultaneously. In addition, each shot is assigned to a level in the hierarchy, described in terms of point, game and set. As a result, the overall structure of the video is identified. This structure can be used for video abstracting and for non-linear browsing of the document.
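As a rough illustration of how an HMM can label a sequence of shots, the sketch below runs Viterbi decoding over a toy model. It is not the model from the thesis: the two states ("rally", "other"), the two visual observation symbols ("global_view", "close_up") and all probabilities are hypothetical values chosen for the example, and audio cues are left out for brevity.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []
    # Work in log space to avoid underflow on long shot sequences.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in states:
            best_prev = max(states,
                            key=lambda p: prev[p] + math.log(trans_p[p][s]))
            current[s] = (prev[best_prev]
                          + math.log(trans_p[best_prev][s])
                          + math.log(emit_p[s][obs]))
            pointers[s] = best_prev
        scores.append(current)
        back.append(pointers)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: states and probabilities are illustrative only.
STATES = ["rally", "other"]
START_P = {"rally": 0.5, "other": 0.5}
TRANS_P = {"rally": {"rally": 0.6, "other": 0.4},
           "other": {"rally": 0.4, "other": 0.6}}
EMIT_P = {"rally": {"global_view": 0.9, "close_up": 0.1},
          "other": {"global_view": 0.2, "close_up": 0.8}}

shots = ["global_view", "global_view", "close_up", "global_view"]
labels = viterbi(shots, STATES, START_P, TRANS_P, EMIT_P)
print(labels)  # ['rally', 'rally', 'other', 'rally']
```

Here the transition table plays the role of prior knowledge about editing rules, and the emission table ties each state to the type of view observed in a shot. A multimodal variant could, for instance, multiply in a second emission term for audio events, under the assumption that audio and visual features are conditionally independent given the state.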
Contributor : Patrick Gros
Submitted on : Thursday, November 4, 2010 - 5:34:46 PM
Last modification on : Friday, November 16, 2018 - 1:30:29 AM
Long-term archiving on : Saturday, February 5, 2011 - 2:32:32 AM


  • HAL Id : tel-00532944, version 1


Ewa Kijak. Structuration multimodale des vidéos de sport par modèles stochastiques. Interface homme-machine [cs.HC]. Université Rennes 1, 2003. Français. ⟨tel-00532944⟩


