Designing safe and highly available distributed applications

Abstract : Designing distributed applications involves a fundamental trade-off between safety and performance, as described by the CAP theorem. We focus on the cases where safety is the top requirement. For the subclass of state-based distributed systems, we propose a proof methodology for establishing that a given application maintains a given invariant. Our approach allows reasoning about individual operations separately. We demonstrate that our rules are sound and, using a mechanized proof engine, illustrate their use with some representative examples. For conflicting operations, the developer can choose between conflict resolution and coordination. We present a novel replicated tree data structure that supports coordination-free concurrent atomic moves, and arguably maintains the tree invariant. Our analysis identifies cases where concurrent moves are inherently safe. For the remaining cases we devise a conflict resolution algorithm. The trade-off is that in some cases a move operation "loses". The coordination that some applications require for safety can be implemented in many different ways. Even when restricted to locks, implementations can use various configurations, differing in lock granularity, type, and placement. The performance of each configuration depends on the workload. We study the "coordination lattice", i.e., the design space of lock configurations, and define a set of metrics to systematically navigate it.
Contributor : Sreeja Nair
Submitted on : Thursday, September 9, 2021 - 2:10:08 PM
Last modification on : Saturday, September 11, 2021 - 3:40:32 AM


  • HAL Id : tel-03339393, version 1


Sreeja Nair. Designing safe and highly available distributed applications. Distributed, Parallel, and Cluster Computing [cs.DC]. Sorbonne Université, 2021. English. ⟨tel-03339393⟩