Continuous top-k queries over real-time web streams

Despoina Vouzoukidou 1
1 BD - Bases de Données
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : This thesis studies efficient evaluation techniques for continuous top-k queries over text and feedback streams, using generalized scoring functions that capture dynamic ranking aspects. As a first contribution, we generalize state-of-the-art continuous top-k query models by introducing a general family of non-homogeneous scoring functions that combine query-independent item importance, query-dependent content relevance, and continuous score decay reflecting information freshness. Our second contribution is the definition and implementation of efficient in-memory data structures for indexing and evaluating this new family of continuous top-k queries. Our experiments show that our solution is scalable and, when restricted to homogeneous functions, outperforms existing state-of-the-art solutions. In the second part of the thesis, we consider the problem of incorporating dynamic feedback signals into the original scoring function, and we propose a general real-time query evaluation framework with a family of new algorithms for efficiently processing continuous top-k queries with dynamic feedback scores in a real-time web context. Finally, bringing these results together, we present MeowsReader, a real-time news ranking and filtering prototype that illustrates how a general class of continuous top-k queries offers a suitable abstraction for modelling and implementing continuous online information filtering applications that combine keyword search and real-time web activity.
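To make the three ingredients named in the abstract concrete (query-independent importance, query-dependent relevance, and score decay), the following is a minimal illustrative sketch, not the thesis's actual scoring family or index structures. The linear mix controlled by `alpha`, the exponential decay with `decay_rate`, the term-overlap `relevance` function, and all names are assumptions introduced only for illustration; the top-k step is a naive rescan rather than an incremental in-memory index.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A stream item; all fields are hypothetical, for illustration only."""
    item_id: str
    terms: frozenset      # keywords extracted from the item text
    importance: float     # query-independent importance, assumed in [0, 1]
    published_at: float   # publication time in seconds


def relevance(query_terms, item):
    """Assumed relevance: fraction of query terms that occur in the item."""
    if not query_terms:
        return 0.0
    return len(query_terms & item.terms) / len(query_terms)


def score(query_terms, item, now, alpha=0.5, decay_rate=0.001):
    """Mix importance and relevance, then apply exponential decay by age."""
    static = alpha * item.importance + (1 - alpha) * relevance(query_terms, item)
    age = max(0.0, now - item.published_at)
    return static * math.exp(-decay_rate * age)


def top_k(query_terms, items, now, k):
    """Naive evaluation: rescore every item at time `now`, keep the k best."""
    scored = ((score(query_terms, it, now), it.item_id) for it in items)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    old = Item("a", frozenset({"paris", "news"}), 0.2, published_at=0.0)
    fresh = Item("b", frozenset({"paris"}), 0.2, published_at=1000.0)
    for s, item_id in top_k(frozenset({"paris"}), [old, fresh], now=1000.0, k=2):
        print(item_id, round(s, 4))
```

In this sketch both items match the query equally well, so the fresher one ranks first purely because of decay. A continuous query system would avoid the full rescan on every arrival, which is the role of the indexing structures the abstract refers to.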
Cited literature : 71 references

https://tel.archives-ouvertes.fr/tel-01366673
Contributor : Abes Star
Submitted on : Thursday, September 15, 2016 - 10:30:07 AM
Last modification on : Friday, January 8, 2021 - 5:32:09 PM
Long-term archiving on : Friday, December 16, 2016 - 1:15:31 PM

File

2015PA066659.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01366673, version 1

Citation

Despoina Vouzoukidou. Continuous top-k queries over real-time web streams. Data Structures and Algorithms [cs.DS]. Université Pierre et Marie Curie - Paris VI, 2015. English. ⟨NNT : 2015PA066659⟩. ⟨tel-01366673⟩

Metrics

Record views : 429
Files downloads : 1164