Theses

Aggregated search in Distributed Graph Databases

Abstract : In this research, we investigate issues related to query evaluation and optimization in the framework of aggregated search. Aggregated search is a new paradigm for accessing massively distributed information. It aims to produce answers to queries by combining fragments of information from different sources. The queries search for objects (documents) that do not exist as such in the targeted sources, but are built from fragments extracted from the different sources. The sources might not be specified in the query expression; they are dynamically discovered at runtime. In our work, we consider data dependencies to propose a framework for optimizing query evaluation over distributed graph-oriented data sources. For this purpose, we propose an approach for the document indexing/organizing process of aggregated search systems. We consider information retrieval systems that are graph oriented (RDF graphs). Because it uses graph relationships, our work falls within relational aggregated search, where relationships are used to aggregate fragments of information. Our goal is to optimize access to the sources of information in an aggregated search system. These sources contain fragments of information that are partially relevant to the query. We aim to minimize the number of sources to query and to maximize the aggregation operations performed within a single source. To this end, we propose to reorganize the graph database(s) into partitions dedicated to aggregated queries. We use a semantic or structural clustering of RDF predicates. For structural clustering, we propose to use frequent subgraph mining algorithms, and we performed a comparative study of their performance for this purpose. For semantic clustering, we use the descriptive metadata of RDF predicates and apply semantic textual similarity methods to calculate their relatedness. Following the clustering, we define query decomposition rules based on the semantic/structural aspects of RDF predicates.
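
As a rough illustration of the pipeline the abstract outlines (cluster RDF predicates, partition the graph by cluster, then decompose a query along the same clusters), the following minimal Python sketch may help. It is not the thesis's method: token-overlap (Jaccard) similarity stands in for the semantic textual similarity measures, the greedy clustering and the 0.5 threshold are arbitrary choices, and every predicate name, description, and function name is hypothetical.

```python
from collections import defaultdict

def jaccard(a, b):
    """Token-overlap similarity between two predicate descriptions
    (a simple stand-in for a semantic textual similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cluster_predicates(descriptions, threshold=0.5):
    """Greedy clustering: a predicate joins the first cluster whose
    first member is similar enough, otherwise it opens a new cluster."""
    clusters = []
    for pred, text in descriptions.items():
        for cluster in clusters:
            if jaccard(text, descriptions[cluster[0]]) >= threshold:
                cluster.append(pred)
                break
        else:
            clusters.append([pred])
    return clusters

def group_by_partition(triples, pred_to_part):
    """Group triples (or query triple patterns) by the partition that
    owns their predicate. Assumes every predicate has been clustered."""
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[pred_to_part[p]].append((s, p, o))
    return dict(groups)

# Hypothetical descriptive metadata for four predicates.
descriptions = {
    "ex:name": "the name of a person",
    "ex:birthDate": "the birth date of a person",
    "ex:title": "the title of a published work",
    "ex:publisher": "the publisher of a published work",
}
clusters = cluster_predicates(descriptions)
pred_to_part = {p: i for i, c in enumerate(clusters) for p in c}

# Reorganize a toy graph into partitions, one per predicate cluster.
triples = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:birthDate", "1990-01-01"),
    ("ex:book1", "ex:title", "Graphs"),
    ("ex:book1", "ex:publisher", "ex:acme"),
]
partitions = group_by_partition(triples, pred_to_part)

# Decompose a query: one sub-query per partition it touches.
query = [
    ("?p", "ex:name", "?n"),
    ("?p", "ex:birthDate", "?d"),
    ("?b", "ex:title", "?t"),
]
subqueries = group_by_partition(query, pred_to_part)
```

In this toy setting the three-pattern query touches only two partitions, and the two person-related patterns can be joined inside a single partition, which reflects the stated goals of querying fewer sources and aggregating within one source where possible.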

https://tel.archives-ouvertes.fr/tel-02520460
Contributor : Abes Star
Submitted on : Tuesday, March 31, 2020 - 9:12:10 AM
Last modification on : Wednesday, April 1, 2020 - 1:50:55 AM

File

TH2019AYEDRIHAB.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02520460, version 2

Citation

Rihab Ayed. Aggregated search in Distributed Graph Databases. Databases [cs.DB]. Université de Lyon; Université de Carthage (Tunisie), 2019. English. ⟨NNT : 2019LYSE1305⟩. ⟨tel-02520460v2⟩

Metrics

Record views : 72
Files downloads : 75