Load-balancing and resource-provisioning in large distributed systems

Mathieu Leconte 1, 2, 3
2 DYOGENE - Dynamics of Geometric Networks
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract : The main theme of this thesis is load-balancing in large sparse random graphs. In computer science, a load-balancing problem arises when a set of tasks must be distributed across multiple resources; solving it means specifying which resources handle which tasks. Depending on the context, tasks and resources may have different interpretations. To make things concrete, this document focuses on two applications:
  • a multiple-choice hashing system (often referred to as cuckoo hashing in the literature), where the goal is to assign buckets to items efficiently so that the items, or any associated data, can be stored and later retrieved quickly. Here tasks correspond to items and resources to buckets.
  • a content delivery network (CDN) with a multitude of servers that handle the storage and service of contents. Here tasks are requests for a particular content, resources correspond to the servers and the contents they store, and solving the load-balancing problem means assigning servers to requests.
The local constraints on which resources are suitable for a particular task, together with the initial amounts of the available resources and the workload associated with each task, can be represented efficiently as a capacitated bipartite graph. Moreover, in practice, and in particular for the two examples above, the systems considered are often very large, possibly involving thousands of different tasks and resources, and they tend to be quite random (either by design or due to a lack of coordination capabilities). Large random graphs are therefore a particularly well-suited setting for the evaluations considered.
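As an illustration of the capacitated bipartite graph described above, the following minimal Python sketch models multiple-choice hashing: each item draws `d` candidate buckets at random, and a valid hashtable exists exactly when a maximum capacitated matching covers every item. The function names, the parameters `d` and `capacity`, and the use of a simple augmenting-path search are assumptions made for this example; they are not taken from the thesis.

```python
import random

def random_choices(n_items, n_buckets, d, rng):
    """Each item draws d distinct candidate buckets uniformly at random."""
    return [rng.sample(range(n_buckets), d) for _ in range(n_items)]

def max_matching(choices, n_buckets, capacity=1):
    """Size of a maximum capacitated matching, found with augmenting paths.

    choices[i] lists the buckets allowed for item i; every bucket can hold
    at most `capacity` items (illustrative parameter).
    """
    # slots[b] lists the items currently stored in bucket b.
    slots = [[] for _ in range(n_buckets)]

    def try_place(item, visited):
        for b in choices[item]:
            if b in visited:
                continue
            visited.add(b)
            if len(slots[b]) < capacity:
                slots[b].append(item)
                return True
            # Bucket is full: try to move one of its occupants elsewhere.
            for k, other in enumerate(slots[b]):
                if try_place(other, visited):
                    slots[b][k] = item
                    return True
        return False

    return sum(try_place(i, set()) for i in range(len(choices)))

def table_is_valid(choices, n_buckets, capacity=1):
    """True when every item can be stored in one of its candidate buckets."""
    return max_matching(choices, n_buckets, capacity) == len(choices)
```

For instance, `table_is_valid(random_choices(40, 100, 2, random.Random(0)), 100)` checks one random instance; repeating this over many instances and loads gives an empirical view of when a valid table exists.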
As the spectrum of solutions to a particular load-balancing problem is vast, it is essential to understand the performance of the optimal solution (disregarding its potential complexity) in order to assess the relative efficiency of any given candidate scheme. This optimal load-balancing performance can be derived from the size of maximum capacitated matchings in a large sparse random graph. We analyze this quantity using the cavity method, a powerful tool from the study of disordered systems in statistical physics, and show in the process how to apply this method rigorously to the setups of interest for our work. Returning to the cuckoo hashing example, we obtain the load thresholds under which cuckoo hashing succeeds with high probability in building a valid hashtable, and we further show that the same approach can handle other related schemes. In the distributed CDN context, the performance of load-balancing is not the end of the story, as an associated resource-placement problem naturally arises: in such a system, one can choose how to provision resources and how to pool them, i.e., how to replicate contents over the servers. Our study of capacitated matchings already yields the efficiency of static replications of contents under optimal load-balancing, and we further obtain the limits of the optimal replication as the storage capacity of servers increases. Finally, as optimal load-balancing may be too complex for many realistic distributed CDN systems, we address load-balancing performance and resource-placement optimization under a much simpler load-balancing scheme (random greedy), using mean-field large-storage approximations. We also design efficient adaptive replication algorithms for this setup.
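To make the random greedy scheme mentioned above more tangible, here is a minimal Python sketch under simplifying assumptions that are ours, not the thesis's: requests arrive one by one in a static batch, each server has a fixed service capacity, and a request is lost when no server holding its content has spare capacity. The names `random_greedy`, `holders` and `capacity` are hypothetical.

```python
import random

def random_greedy(requests, holders, capacity, rng):
    """Serve requests in arrival order with a random greedy rule.

    requests: list of content ids, in arrival order.
    holders:  dict mapping a content id to the servers storing it.
    capacity: dict mapping a server id to the number of requests it can serve.
    Returns the number of requests served; the others are lost.
    """
    load = {server: 0 for server in capacity}
    served = 0
    for content in requests:
        # Servers that store the content and still have spare capacity.
        free = [s for s in holders.get(content, []) if load[s] < capacity[s]]
        if free:
            load[rng.choice(free)] += 1
            served += 1
    return served
```

With `holders = {"a": [0], "b": [0, 1]}` and unit capacities, the arrival order `["b", "a"]` can lose the request for `"a"` if `"b"` is sent to server 0, whereas an optimal assignment serves both. This gap between greedy and optimal load-balancing is what makes the choice of replication matter.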
Cited literature : 122 references

https://tel.archives-ouvertes.fr/tel-00933645
Contributor : Mathieu Leconte
Submitted on : Monday, January 20, 2014 - 6:02:34 PM
Last modification on : Monday, January 28, 2019 - 9:04:42 AM
Document(s) archived on : Monday, April 21, 2014 - 6:20:25 PM

Identifiers

  • HAL Id : tel-00933645, version 1

Citation

Mathieu Leconte. Load-balancing and resource-provisioning in large distributed systems. Data Structures and Algorithms [cs.DS]. Telecom ParisTech, 2013. English. ⟨tel-00933645⟩
