Theses

Top-k search over rich web content

Abstract : Social networks are increasingly present in our everyday lives and are fast becoming our primary means of information and communication. As they contain more and more data about our surroundings and ourselves, it becomes vital to access and analyze this data. Currently, the primary means of querying this data is top-k keyword search: you enter a few words and the social network service returns a fixed number of relevant documents. In current top-k searches in a social context, the relevance of a document is evaluated based on two factors: the overlap between the query keywords and the words of the document, and the social proximity between the document and the user making the query. We argue that this is limiting and propose to take into account the complex interactions between the users linked to the document, the structure of the document, and the meaning of the words it contains rather than their phrasing. To this end, we highlight the requirements for a model that fully integrates structured, semantic and social data, and we propose a new model, called S3, that satisfies these requirements. We introduce querying capabilities to S3 and develop an algorithm, S3k, for customizable top-k keyword search on S3. We prove the correctness of our algorithm and propose an implementation of it. We compare this implementation with another top-k keyword search in a social context, using datasets created from real-world data, and show how the two differ and what benefits our approach brings.
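The two-factor relevance described in the abstract as the current practice (keyword overlap plus social proximity) can be illustrated with a small sketch. This is not the S3 model or the S3k algorithm from the thesis; the function names, the linear combination, the weight `alpha`, and the data layout are all assumptions chosen only for illustration.

```python
import heapq


def keyword_overlap(query, document_words):
    """Fraction of the query keywords that appear in the document."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & set(document_words)) / len(query_terms)


def top_k(query, documents, proximity, k, alpha=0.5):
    """Return the k best (score, doc_id) pairs, highest score first.

    documents: dict doc_id -> (author, list of lowercase words)
    proximity: dict author -> proximity to the querying user, in [0, 1]
    alpha:     hypothetical weight between textual and social relevance
    """
    scored = []
    for doc_id, (author, words) in documents.items():
        text_score = keyword_overlap(query, words)
        if text_score == 0.0:
            continue  # a document sharing no keyword is never returned
        social_score = proximity.get(author, 0.0)
        score = alpha * text_score + (1 - alpha) * social_score
        scored.append((score, doc_id))
    return heapq.nlargest(k, scored)


# Made-up data: two documents match the query equally well on keywords,
# so the author closer to the querying user ranks first.
documents = {
    "d1": ("alice", ["top", "k", "search"]),
    "d2": ("bob", ["top", "search", "web"]),
    "d3": ("carol", ["cooking"]),
}
proximity = {"alice": 0.2, "bob": 0.9}
ranking = top_k("top search", documents, proximity, k=2)
```

In this sketch a document is ranked only by which words it shares with the query and by how close its author is to the user, which is the limitation the abstract points to: the structure of the document and the meaning of its words play no role.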
https://tel.archives-ouvertes.fr/tel-01418124
Contributor : Abes Star
Submitted on : Friday, December 16, 2016 - 1:05:05 PM
Last modification on : Wednesday, September 16, 2020 - 5:00:15 PM
Long-term archiving on : Tuesday, March 21, 2017 - 4:09:31 AM

File

76235_BONAQUE_2016_diffusion.p...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01418124, version 1

Citation

Raphaël Bonaque. Top-k search over rich web content. Databases [cs.DB]. Université Paris Saclay (COmUE), 2016. English. ⟨NNT : 2016SACLS291⟩. ⟨tel-01418124⟩

Metrics

Record views : 634
File downloads : 873