Toward an autonomic engine for scientific workflows and elastic Cloud infrastructure

Abstract : The constant development of scientific and industrial computation infrastructures requires the concurrent development of scheduling and deployment mechanisms to manage such infrastructures. Throughout the last decade, the emergence of the Cloud paradigm raised many hopes, but achieving full platform autonomicity is still an ongoing challenge. Work undertaken during this PhD aimed at building a workflow engine that integrated the logic needed to manage workflow execution and Cloud deployment on its own. More precisely, we focus on Cloud solutions with a dedicated Data as a Service (DaaS) data management component. Our objective was to automate the execution of workflows submitted by many users on elastic Cloud resources.

This contribution proposes a modular middleware infrastructure and details the implementation of the underlying modules:

  • A workflow clustering algorithm that optimises data locality in the context of DaaS-centered communications;
  • A dynamic scheduler that executes clustered workflows on Cloud resources;
  • A deployment manager that handles the allocation and deallocation of Cloud resources according to the workload characteristics and users’ requirements.

All these modules have been implemented in a simulator to analyse their behaviour and measure their effectiveness when running both synthetic and real scientific workflows. We also implemented these modules in the Diet middleware to give it new features and prove the versatility of this approach.

Simulation running the WASABI workflow (waves analysis based inference, a framework for the reconstruction of gene regulatory networks) showed that our approach can decrease the deployment cost by up to 44% while meeting the required deadlines.

Cited literature [48 references]

https://tel.archives-ouvertes.fr/tel-01988995
Contributor : Abes Star
Submitted on : Tuesday, January 22, 2019 - 11:02:19 AM
Last modification on : Thursday, November 21, 2019 - 2:35:49 AM

File

CROUBOIS_Hadrien_2018LYSEN061_...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01988995, version 1

Citation

Hadrien Croubois. Toward an autonomic engine for scientific workflows and elastic Cloud infrastructure. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Lyon, 2018. English. ⟨NNT : 2018LYSEN061⟩. ⟨tel-01988995⟩

Metrics

Record views : 118
File downloads : 95