
Load-balancing and resource-provisioning in large distributed systems

Mathieu Leconte 1, 2, 3
2 DYOGENE - Dynamics of Geometric Networks
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : The main theme of this thesis is load-balancing in large sparse random graphs. In computer science, a load-balancing problem arises when a set of tasks must be distributed across multiple resources; solving it means specifying which tasks are handled by which resources. Depending on the context, tasks and resources may have different interpretations. To make things concrete, we focus in this document on two particular applications:
- a multiple-choice hashing system (often referred to as cuckoo hashing in the literature), where the goal is to efficiently assign buckets to items so that the items, or any associated data, can be stored and later retrieved quickly. Here tasks correspond to items and resources to buckets.
- a content delivery network (CDN) with a multitude of servers that handle the storage and service of contents. Here tasks are requests for a particular content, resources correspond to the servers and the contents they store, and solving the load-balancing problem means assigning servers to requests.
The local constraints specifying which resource is suitable for a particular task, together with the initial amounts of the available resources and the workload associated with each task, can be efficiently represented as a capacitated bipartite graph. Moreover, in practice, and in particular for the two examples mentioned, the systems considered are often very large, involving possibly thousands of different tasks and resources, and they tend to be quite random (either by design or due to a lack of coordination capabilities). The context of large random graphs is therefore particularly well-suited to the considered evaluations.
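To illustrate the capacitated bipartite graph view described in the abstract, here is a minimal Python sketch that computes a maximum assignment of tasks to resources by augmenting paths. It is not taken from the thesis: the function name `max_capacitated_matching`, the dictionary-based input format, and the restriction to unit-size tasks are all assumptions made for illustration. In the cuckoo hashing reading, tasks are items, resources are buckets, and a valid hashtable exists exactly when every item is assigned.

```python
from typing import Dict, Hashable, List, Set


def max_capacitated_matching(
    choices: Dict[Hashable, List[Hashable]],
    capacity: Dict[Hashable, int],
) -> Dict[Hashable, Hashable]:
    """Return a maximum assignment task -> resource.

    Assumptions (illustrative, not from the thesis): every task has unit
    size, `choices[t]` lists the resources suitable for task t, and every
    resource that appears in `choices` has an entry in `capacity`.
    """
    assigned: Dict[Hashable, Hashable] = {}
    load: Dict[Hashable, List[Hashable]] = {r: [] for r in capacity}

    def try_place(task: Hashable, visited: Set[Hashable]) -> bool:
        # Depth-first search for an augmenting path starting at `task`.
        for r in choices[task]:
            if r in visited:
                continue
            visited.add(r)
            if len(load[r]) < capacity[r]:
                # Free slot: place the task directly.
                load[r].append(task)
                assigned[task] = r
                return True
            # Resource is full: try to relocate one of its current tasks.
            for other in list(load[r]):
                if try_place(other, visited):
                    load[r].remove(other)
                    load[r].append(task)
                    assigned[task] = r
                    return True
        return False

    for task in choices:
        try_place(task, set())
    return assigned


# Three items, two buckets of capacity 1: at most two items can be stored.
items = {"x": [0, 1], "y": [0], "z": [0, 1]}
print(len(max_capacitated_matching(items, {0: 1, 1: 1})))  # 2
```

The recursion is fine for small instances; for the very large graphs the abstract has in mind, an iterative search or a max-flow solver would be the more practical choice.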
As the spectrum of solutions to a particular load-balancing problem is vast, it is essential to understand the performance of the optimal solution to the load-balancing problem (disregarding its potential complexity) in order to assess the relative efficiency of any given candidate scheme. This optimal load-balancing performance can be derived from the size of maximum capacitated matchings in a large sparse random graph. We analyze this quantity using the cavity method, a powerful tool coming from the study of disordered systems in statistical physics, showing in the process how to rigorously apply this method to the setups of interest for our work. Coming back to the cuckoo hashing example, we obtain the load thresholds under which cuckoo hashing succeeds with high probability in building a valid hashtable, and we further show that the same approach can handle other related schemes. In the distributed CDN context, the performance of load-balancing is not the end of the story, as an associated resource-placement problem naturally arises: in such a system, one can choose how to provision resources and how to pool them, i.e., how to replicate contents over the servers. Our study of capacitated matchings already yields the efficiency of static replications of contents under optimal load-balancing, and we further obtain the limits of the optimal replication when the storage capacity of servers increases. Finally, as optimal load-balancing may be too complex for many realistic distributed CDN systems, we address the issues of load-balancing performance and resource-placement optimization under a much simpler random greedy load-balancing scheme, using mean-field large-storage approximations. We also design efficient adaptive replication algorithms for this setup.
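The abstract does not spell out the random greedy scheme, so the following Python sketch is only one plausible reading, stated as an assumption: each request, in arrival order, is served by a uniformly random server among those that store the requested content and still have spare capacity, and it is dropped otherwise. The names `random_greedy_assign`, `replicas`, and `capacity` are hypothetical, and the sketch ignores request departures.

```python
import random
from typing import Dict, Hashable, List, Optional


def random_greedy_assign(
    requests: List[Hashable],
    replicas: Dict[Hashable, List[int]],
    capacity: Dict[int, int],
    rng: random.Random,
) -> List[Optional[int]]:
    """Assign each request to a server, or None if it is dropped.

    Assumed rule (illustrative only): pick uniformly at random among the
    servers holding the content that still have a free service slot.
    Decisions are never revisited, unlike in an optimal matching.
    """
    free = dict(capacity)  # remaining service slots per server
    result: List[Optional[int]] = []
    for content in requests:
        candidates = [s for s in replicas.get(content, []) if free[s] > 0]
        if not candidates:
            result.append(None)  # no suitable server left: request is lost
            continue
        server = rng.choice(candidates)
        free[server] -= 1
        result.append(server)
    return result


# Content "A" is replicated on servers 0 and 1, content "B" only on server 0.
replicas = {"A": [0, 1], "B": [0]}
served = random_greedy_assign(["A", "B"], replicas, {0: 1, 1: 1}, random.Random())
print(served)
```

In this small instance an optimal assignment serves both requests (A on server 1, B on server 0), whereas the greedy rule drops B whenever it happens to place A on server 0. This gap between greedy and optimal is the kind of effect the performance analysis has to quantify.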

Cited literature [122 references]
Contributor : Mathieu Leconte
Submitted on : Monday, January 20, 2014 - 6:02:34 PM
Last modification on : Thursday, November 18, 2021 - 4:10:35 AM
Long-term archiving on : Monday, April 21, 2014 - 6:20:25 PM


  • HAL Id : tel-00933645, version 1


Mathieu Leconte. Load-balancing and resource-provisioning in large distributed systems. Data Structures and Algorithms [cs.DS]. Telecom ParisTech, 2013. English. ⟨tel-00933645⟩


