Content-based inference of structural grammar for recurrent TV programs from a collection of episodes

Abstract : TV program structuring has emerged over the last decade as a major theme for high-quality indexing. In this thesis, we address the problem of unsupervised TV program structuring from the point of view of grammatical inference, i.e., discovering a common structural model shared by a collection of episodes of a recurrent program. Using grammatical inference makes it possible to rely on only minimal domain knowledge. In particular, we assume no prior knowledge of the structural elements that might be present in a recurrent program and, apart from its recurrence, only very limited knowledge of the program type, e.g., to name structural elements. Under this assumption, we propose an unsupervised framework operating in two stages. The first stage aims at determining the structural elements that are relevant to the structure of a program. We address this issue by exploiting the repetitiveness of elements in recurrent programs, leveraging temporal density analysis to filter out irrelevant events and determine valid elements. Once the structural elements have been discovered, the second stage infers a grammar of the program. We explore two inference techniques, based either on multiple sequence alignment or on uniform resampling. A model of the structure is derived from the grammars and used to predict the structure of new episodes. Evaluations are performed on a selection of four different types of recurrent programs. For structural element determination, we analyze how the threshold applied to the density function and the size of the collection of episodes affect the number of structural elements determined. For structural grammar inference, we discuss the quality of the grammars obtained and show that they accurately reflect the structure of the program. We also demonstrate that the models obtained by grammatical inference can accurately predict the structure of unseen episodes, conducting a quantitative and comparative evaluation of the two methods by segmenting the new episodes into their structural components. Finally, considering the limitations of our work, we discuss a number of open issues in structure discovery and propose three new research directions to address in future work.
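As a rough illustration of the first stage, the sketch below keeps only the event types that recur at a similar relative position in most episodes. It is a simplified stand-in, not the method of the thesis: the histogram-based density, the function name `select_structural_elements`, the parameters `n_bins` and `threshold`, and the toy event labels are all assumptions made for this example.

```python
from collections import defaultdict


def select_structural_elements(episodes, n_bins=10, threshold=0.6):
    """Keep event labels that recur at a similar relative time in most episodes.

    episodes: list of episodes, each a list of (label, position) pairs, where
    position is the event time as a fraction of episode duration, in [0, 1).
    Returns {label: (peak_bin, density)} for labels whose peak density,
    i.e. the fraction of episodes with that label in the peak bin,
    reaches the threshold.
    """
    # counts[label][b] = number of episodes in which label occurs in bin b
    counts = defaultdict(lambda: [0] * n_bins)
    for events in episodes:
        seen = set()  # count each (label, bin) at most once per episode
        for label, position in events:
            b = min(int(position * n_bins), n_bins - 1)
            if (label, b) not in seen:
                seen.add((label, b))
                counts[label][b] += 1

    selected = {}
    for label, bins in counts.items():
        peak = max(bins)
        density = peak / len(episodes)
        if density >= threshold:
            selected[label] = (bins.index(peak), density)
    return selected


# Hypothetical toy collection: three episodes of one recurrent program.
episodes = [
    [("jingle", 0.02), ("anchor", 0.11), ("ad", 0.55), ("jingle", 0.97)],
    [("jingle", 0.03), ("anchor", 0.12), ("ad", 0.31), ("jingle", 0.96)],
    [("jingle", 0.01), ("anchor", 0.14), ("ad", 0.78), ("jingle", 0.98)],
]
elements = select_structural_elements(episodes)
```

In this toy data the jingle and anchor events fall in the same time bins in every episode and are kept, whereas the ad event drifts across episodes and is filtered out. Raising `threshold` or shrinking the collection changes how many elements survive, which is the kind of effect the evaluation in the abstract refers to.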
Cited literature : 79 references

https://tel.archives-ouvertes.fr/tel-01337252
Contributor : Abes Star
Submitted on : Friday, June 24, 2016 - 5:02:08 PM
Last modification on : Friday, January 11, 2019 - 2:27:04 PM
Long-term archiving on : Sunday, September 25, 2016 - 12:48:28 PM

File

QU_Bingqing.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01337252, version 1

Citation

Bingqing Qu. Content-based inference of structural grammar for recurrent TV programs from a collection of episodes. Formal Languages and Automata Theory [cs.FL]. Université Rennes 1, 2015. English. ⟨NNT : 2015REN1S139⟩. ⟨tel-01337252⟩

Metrics

Record views : 427
Files downloads : 166