BlobSeer: Towards efficient data storage management for large-scale, distributed systems

Bogdan Nicolae 1
1 KerData - Scalable Storage for Clouds and Beyond
IRISA-D1 - SYSTÈMES LARGE ÉCHELLE, Inria Rennes – Bretagne Atlantique
Abstract: With data volumes increasing at a high rate and the emergence of highly scalable infrastructures (cloud computing, petascale computing), distributed data management has become a crucial issue that raises many challenges. This thesis makes several contributions to address these challenges. First, it proposes a set of principles for designing highly scalable distributed storage systems that are optimized for heavy data access concurrency. In particular, it highlights the potentially large benefits of using versioning in this context. Second, based on these principles, it introduces a series of distributed data and metadata management algorithms that enable high throughput under concurrency. Third, it shows how to implement these algorithms efficiently in practice, dealing with key issues such as high-performance parallel transfers, efficient maintenance of distributed data structures, and fault tolerance. These results are used to build BlobSeer, an experimental prototype that demonstrates both the theoretical benefits of the approach in synthetic benchmarks and its practical benefits in real-life application scenarios: as a storage backend for MapReduce applications, as a storage backend for deployment and snapshotting of virtual machine images in clouds, and as a quality-of-service-enabled data storage service for cloud applications. Extensive experiments on the Grid'5000 testbed show that BlobSeer remains scalable and sustains a high throughput even under heavy access concurrency, outperforming several state-of-the-art approaches by a large margin.
Document type:
Computer Science [cs]. Université Rennes 1, 2010. English

Cited literature [165 references]
Contributor: Bogdan Nicolae
Submitted on: Wednesday, January 5, 2011 - 19:08:55
Last modified on: Friday, November 16, 2018 - 01:40:39
Document(s) archived on: Monday, November 5, 2012 - 15:45:26



  • HAL Id: tel-00552271, version 1


Bogdan Nicolae. BlobSeer: Towards efficient data storage management for large-scale, distributed systems. Computer Science [cs]. Université Rennes 1, 2010. English. 〈tel-00552271〉


