
An Algebraic Approach for Scientific Workflows with Large Scale Data

Eduardo Ogasawara
Abstract : Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments in computational simulations. In many situations, these workflows are computationally and data intensive, thus requiring execution on large-scale parallel computers. However, the parallelization of scientific workflows is low-level, ad hoc and labor-intensive, which makes it hard to exploit optimization opportunities. To address the problem of optimizing the parallel execution of scientific workflows, we propose an algebraic approach to represent the workflow and a parallel execution model that together enable the automatic optimization of the parallel execution of scientific workflows. We conducted a thorough validation of our approach using both real applications and synthetic data scenarios. The experiments were run in Chiron, a data-centric scientific workflow engine implemented to parallelize scientific workflow execution. Our experiments demonstrated excellent parallel performance improvements and showed that the algebraic approach exposes several optimization opportunities that are not available in ad hoc workflow implementations.
Contributor : Patrick Valduriez
Submitted on : Wednesday, January 4, 2012 - 9:11:47 AM
Last modification on : Monday, October 19, 2020 - 2:34:03 PM
Long-term archiving on : Tuesday, December 13, 2016 - 6:28:33 PM


  • HAL Id : tel-00653661, version 1


Eduardo Ogasawara. An Algebraic Approach for Scientific Workflows with Large Scale Data. Databases [cs.DB]. Universidade Federal do Rio de Janeiro, 2011. Portuguese. ⟨tel-00653661v1⟩


