Spatial Query Optimization and Distributed Data Server - Application in the Management of Big Astronomical Surveys

Abstract : The big scientific data generated by modern observation telescopes raise recurring performance problems, in spite of the advances in distributed data management systems. The main reasons are the complexity of these systems and the difficulty of adapting the access methods to the data. This thesis proposes new physical and logical optimizations that improve the execution plans of astronomical queries using transformation rules. These methods are integrated in ASTROIDE, a distributed system for large-scale astronomical data processing. ASTROIDE achieves scalability and efficiency by combining the benefits of distributed processing using Spark with the relevance of an astronomical query optimizer. It supports data access through the commonly used query language ADQL. It implements astronomical query algorithms (cone search, kNN search, cross-match, and kNN join) tailored to the proposed physical data organization. ASTROIDE also offers a data partitioning technique that allows efficient processing of these queries by ensuring load balancing and eliminating irrelevant partitions. This partitioning uses an indexing technique adapted to astronomical data in order to reduce query processing time.
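To make the cone search mentioned in the abstract concrete, here is a minimal single-machine sketch in Python. It is not ASTROIDE's implementation: the function names, the tuple layout of the catalog, and the sample coordinates are all hypothetical, and the brute-force scan stands in for the partitioned, indexed execution the thesis describes. It only illustrates the query semantics: return every source whose angular separation from a given sky position is within a given radius.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions.

    Inputs are right ascension and declination in degrees. The haversine
    formula is used because it stays accurate for small separations.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return all (source_id, ra, dec) tuples within radius_deg of (ra, dec).

    Brute-force scan over every source; a partitioned system would first
    discard partitions that cannot intersect the cone.
    """
    return [src for src in catalog
            if angular_separation_deg(ra, dec, src[1], src[2]) <= radius_deg]


# Hypothetical toy catalog: (source_id, ra_deg, dec_deg).
catalog = [("A", 10.0, 20.0), ("B", 10.05, 20.0), ("C", 200.0, -45.0)]
matches = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
```

In this toy catalog, sources A and B fall inside the 0.1 degree cone and C does not. In a distributed setting the same predicate would run only on the partitions that overlap the cone, which is where partition elimination saves work.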

Cited literature [89 references]

https://tel.archives-ouvertes.fr/tel-02100861
Contributor : Abes Star
Submitted on : Tuesday, April 16, 2019 - 12:05:54 PM
Last modification on : Tuesday, May 14, 2019 - 4:53:18 AM

File

75067_BRAHEM_2019_archivage.pd...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02100861, version 1

Citation

Mariem Brahem. Spatial Query Optimization and Distributed Data Server - Application in the Management of Big Astronomical Surveys. Databases [cs.DB]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLV009⟩. ⟨tel-02100861⟩

Metrics

Record views : 140
File downloads : 107