
SPARQL distributed query processing over linked data

Abdoul Macina 1, 2
1 WIMMICS - Web-Instrumented Man-Machine Interactions, Communities and Semantics
CRISAM - Inria Sophia Antipolis - Méditerranée , Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract : Driven by the Semantic Web standards, an increasing number of RDF data sources are published and connected over the Web by data providers, leading to a large distributed linked data network. However, exploiting the wealth of these data sources is very challenging for data consumers, given the distribution of the data, their growing volume, and the autonomy of the data sources. In the Linked Data context, federation engines allow querying these distributed data sources by relying on Distributed Query Processing (DQP) techniques. Nevertheless, a naive implementation of the DQP approach may generate a tremendous number of remote requests towards data sources and numerous intermediate results, leading to costly network communications. Furthermore, the semantics of distributed queries is often overlooked. Query expressiveness, data partitioning, and data replication are further challenges to be taken into account. To address these challenges, we first propose in this thesis a SPARQL- and RDF-compliant Distributed Query Processing semantics that preserves the expressiveness of the SPARQL language. We then present several strategies for a federated query engine that transparently addresses distributed data sources while managing data partitioning, completeness of query results, data replication, and query processing performance. We implemented and evaluated our approach and optimization strategies in a federated query engine to demonstrate their effectiveness.

Cited literature [96 references]
Contributor : Abes Star
Submitted on : Friday, November 29, 2019 - 2:32:00 PM
Last modification on : Sunday, May 1, 2022 - 3:16:13 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02340700, version 3



Abdoul Macina. SPARQL distributed query processing over linked data. Other [cs.OH]. COMUE Université Côte d'Azur (2015 - 2019), 2018. English. ⟨NNT : 2018AZUR4230⟩. ⟨tel-02340700v3⟩


