Adaptive Consistency Protocols for Replicated Data in Modern Storage Systems with a High Degree of Elasticity

Abstract : The contributions of this thesis are threefold. The first contribution is an efficient way to control stale reads in modern database systems by means of a new consistency protocol called LibRe, an acronym for Library for Replication. The main goal of the LibRe protocol is to ensure data consistency while contacting a minimum number of replica nodes during read and write operations, with the help of a registry (the library). According to the protocol, during write operations each replica node asynchronously updates the registry with the latest version identifier of the updated data. During read operations, consulting the registry allows each read request to be forwarded to an appropriate replica node, which helps to control stale reads. Evaluating data consistency remains challenging both in simulation and in a real-world setup. Hence, we implemented a new simulation toolkit called Simizer that helps to evaluate the performance of different consistency policies quickly and efficiently. We also extended an existing benchmark tool, YCSB, to evaluate the consistency-latency tradeoff offered by modern database systems. The codebases of the simulator and of the extended YCSB have been made open source. The performance of the LibRe protocol is validated both via simulation and in a real setup with the help of the extended YCSB.

Although modern database systems can adapt their consistency guarantees on a per-query basis, anticipating the consistency level of an application query in advance, at development time, remains challenging for application developers. To overcome this limitation, the second contribution of the thesis enables the database system to override the application-defined consistency options at run time with the help of an external input. The external input could be given by a data administrator or by an external service. The thesis validates the proposed model with a prototype implementation inside the Cassandra distributed storage system.

The third contribution of the thesis focuses on resolving update conflicts. Resolving update conflicts often involves maintaining all possible values and performing the resolution with domain-specific knowledge at the client side. This incurs additional cost in terms of network bandwidth and latency, and adds considerable complexity. In this thesis, we discuss the motivation and design of a novel data type called priority register that implements a domain-specific conflict detection and resolution scheme directly at the database side, while leaving open the option of additional reconciliation at the application level. Our approach uses the notion of an application-defined replacement ordering, and we show that a data type parameterized by such an order can provide an efficient solution for applications that demand domain-specific conflict resolution. We also describe the proof-of-concept implementation of the priority register inside Cassandra. The conclusion and perspectives of the thesis work are summarized at the end.
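
The idea of a register parameterized by an application-defined replacement ordering can be illustrated with a small sketch. This is not the thesis's Cassandra implementation: the class name `PriorityRegister`, the `may_replace` callback, the `merge` method, and the order-status example are all hypothetical, and the sketch assumes the ordering is a strict total order so that replicas converge regardless of delivery order.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class PriorityRegister(Generic[T]):
    """Illustrative register whose writes are filtered by a replacement ordering.

    Hypothetical sketch: `may_replace(new, old)` is supplied by the application
    and returns True when `new` is allowed to replace `old`.
    """

    may_replace: Callable[[T, T], bool]
    value: Optional[T] = None

    def write(self, candidate: T) -> bool:
        """Apply a write; return True if the stored value changed."""
        if self.value is None or self.may_replace(candidate, self.value):
            self.value = candidate
            return True
        return False  # candidate is rejected, existing value wins

    def merge(self, other: "PriorityRegister[T]") -> None:
        """Reconcile with another replica by replaying its value as a write."""
        if other.value is not None:
            self.write(other.value)


# Assumed example domain: an order status may only move forward.
ORDER_STATUS_RANK = {"placed": 0, "shipped": 1, "delivered": 2}


def status_may_replace(new: str, old: str) -> bool:
    return ORDER_STATUS_RANK[new] > ORDER_STATUS_RANK[old]


def converge(updates):
    """Apply updates in the given order and return the surviving value."""
    register = PriorityRegister(status_may_replace)
    for update in updates:
        register.write(update)
    return register.value
```

Under these assumptions, two replicas that receive the same updates in different orders, for example `converge(["placed", "shipped", "delivered"])` and `converge(["delivered", "placed", "shipped"])`, keep the same value, because a lower-ranked status never replaces a higher-ranked one. An ordering that leaves some values incomparable would need the additional application-level reconciliation that the abstract mentions.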
Cited literature : 140 references

https://tel.archives-ouvertes.fr/tel-01359621
Contributor : Abes Star
Submitted on : Friday, September 2, 2016 - 5:49:08 PM
Last modification on : Saturday, December 21, 2019 - 3:42:19 AM
Long-term archiving on: Sunday, December 4, 2016 - 9:16:03 AM

File

ThKUMARSathiya.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01359621, version 1

Citation

Sathiya Prabhu Kumar. Adaptive Consistency Protocols for Replicated Data in Modern Storage Systems with a High Degree of Elasticity. Document and Text Processing. Conservatoire national des arts et métiers - CNAM, 2016. English. ⟨NNT : 2016CNAM1035⟩. ⟨tel-01359621⟩

Metrics

Record views : 546
File downloads : 774