Multisite Management of Scientific Workflows in the Cloud

Ji Liu 1, 2, 3
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Scientific Workflows (SWfs) allow scientists to easily express multi-step computational activities, such as loading input data files, processing the data, running analyses, and aggregating the results. A SWf describes the dependencies between activities, typically as a graph where the nodes are activities and the edges express the activity dependencies. SWfs are often data-intensive, i.e. they process, manage or produce huge amounts of data. In order to execute data-intensive SWfs within a reasonable time, Scientific Workflow Management Systems (SWfMSs) can be used and deployed in High Performance Computing (HPC) environments (cluster, grid or cloud). By offering stable services and virtually infinite computing and storage resources at a reasonable cost, the cloud becomes appealing for SWf execution. SWfMSs can be easily deployed in the cloud using Virtual Machines (VMs). A cloud is typically made of several sites (or data centers), each with its own resources and data. Since a SWf may process data located at different sites, SWf execution should be adapted to a multisite cloud while exploiting distributed computing or storage resources. In this thesis, we study the problem of efficiently executing data-intensive SWfs in a multisite cloud, where each site has its own cluster, data and programs. Most SWfMSs have been designed for computer clusters or grids, and some have been extended to operate in the cloud, but only for a single site. To address the problem in the multisite case, we propose a distributed and parallel approach that leverages the resources available at different cloud sites. To exploit parallelism, we use an algebraic approach, which allows expressing SWf activities using operators and automatically transforming them into multiple tasks. The main contribution is a multisite architecture for SWfMSs and distributed techniques to execute SWfs.
The main techniques consist of SWf partitioning algorithms, a dynamic VM provisioning algorithm, an activity scheduling algorithm and a task scheduling algorithm. SWf partitioning algorithms partition a SWf into several fragments, each to be executed at a different cloud site. The VM provisioning algorithm is used to dynamically create an optimal combination of VMs for executing workflow fragments at each cloud site. The activity scheduling algorithm distributes the SWf fragments to the cloud sites based on a multi-objective cost model, which combines both execution time and monetary cost. The task scheduling algorithm directly distributes tasks among different cloud sites while achieving load balancing at each site. Our experiments show that our approach can considerably reduce the overall cost of SWf execution in a multisite cloud.
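
To make the idea of a multi-objective cost model concrete, the following is a minimal sketch, not the algorithm from the thesis. It assumes a normalized weighted sum of execution time and monetary cost and a simple per-fragment greedy choice of site. The names `weighted_cost`, `schedule_fragments`, `w_time`, the reference values used for normalization, and all numbers are hypothetical and chosen only for illustration.

```python
def weighted_cost(time_s, money, ref_time_s, ref_money, w_time=0.5):
    """Normalized weighted sum of execution time and monetary cost.

    Assumption: both objectives are divided by reference values so that
    they are comparable; w_time in [0, 1] is the weight given to time.
    """
    if not 0.0 <= w_time <= 1.0:
        raise ValueError("w_time must be in [0, 1]")
    return w_time * (time_s / ref_time_s) + (1.0 - w_time) * (money / ref_money)


def schedule_fragments(estimates, ref_time_s, ref_money, w_time=0.5):
    """Pick, for each fragment, the site with the lowest weighted cost.

    estimates maps fragment -> {site: (estimated_time_s, estimated_money)}.
    Returns a dict mapping fragment -> chosen site.
    """
    plan = {}
    for fragment, per_site in estimates.items():
        plan[fragment] = min(
            per_site,
            key=lambda site: weighted_cost(
                *per_site[site], ref_time_s, ref_money, w_time
            ),
        )
    return plan


# Hypothetical estimates for two fragments on two cloud sites.
estimates = {
    "fragment_1": {"site_a": (100.0, 4.0), "site_b": (200.0, 1.0)},
    "fragment_2": {"site_a": (120.0, 2.0), "site_b": (150.0, 3.0)},
}
plan = schedule_fragments(estimates, ref_time_s=100.0, ref_money=1.0, w_time=0.5)
print(plan)
```

With equal weights, `fragment_1` goes to the cheaper but slower `site_b`, while `fragment_2` stays on `site_a`. Setting `w_time=1.0` ignores monetary cost and selects the fastest site for every fragment.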

Cited literature : 175 references

https://tel.archives-ouvertes.fr/tel-01400625
Contributor : Ji Liu
Submitted on : Monday, December 12, 2016 - 9:40:09 AM
Last modification on : Friday, March 15, 2019 - 1:15:08 AM
Long-term archiving on : Tuesday, March 28, 2017 - 12:06:44 AM

Identifiers

  • HAL Id : tel-01400625, version 2

Citation

Ji Liu. Multisite Management of Scientific Workflows in the Cloud. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Montpellier, 2016. English. ⟨tel-01400625v2⟩
