Description, deployment and optimization of medical image analysis workflows on production grids

Abstract : Grids are attractive platforms for developing medical image analysis applications: they enable the sharing of data and algorithms and provide large amounts of computing power and data storage. In this thesis, we investigate a medical image analysis problem that turns out to be a typical dimensioning application for grids, which leads us to develop new methods and tools for workflow description, implementation and optimization. The underlying application problem is the evaluation of medical image registration algorithms in the absence of ground truth. We present results obtained with a statistical method applied to a registration problem concerning the follow-up of brain tumors in radiotherapy. These results make it possible to detect subtle flaws in the data. We extend this validation scheme to quantify the impact of lossy image compression on registration algorithms. Because this application is representative of typical grid problems, we study its deployment and execution on such infrastructures. We adopt a generic workflow model to ease the parallelization of the application on a grid infrastructure. A novel taxonomy of workflow approaches is presented. Based on it, we select a suitable workflow language, and we design and implement MOTEUR, an enactor that exploits all the parallelism levels of workflow applications. We also define a new data composition operator that eases the description of medical image analysis applications on grids. Benchmarks on the EGEE production grid, compared with controlled conditions on Grid'5000, reveal that grid latency and its variability cause strong performance drops. We therefore propose a probabilistic model of the execution time of a grid workflow. This model is user-centric: the whole grid is considered as a black box that introduces a random latency into the execution time of a job.
Based on this model, we propose three optimization strategies that aim to reduce the impact of grid latency and its variability: grouping sequentially linked jobs reduces the mean latency faced by a workflow, optimizing the timeout value of jobs reduces the impact of outliers, and optimizing job granularity reduces the risk of facing high latencies. These strategies yield significant speed-ups.
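As an illustration of the timeout strategy mentioned in the abstract, the following minimal sketch treats the grid as a black box that returns a random latency for each job. It is not taken from the thesis: it assumes that a job exceeding the timeout is cancelled and resubmitted, that successive latencies are independent and identically distributed, and that a sample of observed latencies is available. Under those assumptions, the expected total latency for a timeout t is E[min(R, t)] / P(R <= t). The function names and the sample values are hypothetical.

```python
def expected_latency_with_timeout(latencies, timeout):
    """Estimate the expected total latency of a job that is cancelled and
    resubmitted every time its latency exceeds `timeout`.

    Assumption (not from the thesis): attempts are independent draws from
    the empirical distribution given by `latencies`.
    """
    n = len(latencies)
    successes = sum(1 for r in latencies if r <= timeout)
    if successes == 0:
        # No observed job ever finishes within this timeout.
        return float("inf")
    # E[min(R, t)]: each attempt costs its latency, capped at the timeout.
    mean_capped = sum(min(r, timeout) for r in latencies) / n
    # Dividing by P(R <= t) accounts for the expected number of attempts.
    return mean_capped / (successes / n)


def best_timeout(latencies):
    """Return the observed latency value that minimizes the estimate above.

    Only observed values are tried as candidates, since the empirical
    estimate is minimized at one of them.
    """
    candidates = sorted(set(latencies))
    return min(
        candidates,
        key=lambda t: expected_latency_with_timeout(latencies, t),
    )


if __name__ == "__main__":
    # Hypothetical sample in seconds: most jobs start quickly, one is an outlier.
    sample = [100, 100, 100, 1000]
    t = best_timeout(sample)
    print(t, expected_latency_with_timeout(sample, t))
```

With a sample dominated by short latencies and a few outliers, the selected timeout tends to sit just above the bulk of the distribution, which is the intuition behind reducing the impact of outliers.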
Document type :
Theses
Cited literature [202 references]

https://tel.archives-ouvertes.fr/tel-00460156
Contributor : Estelle Nivault
Submitted on : Friday, February 26, 2010 - 1:46:14 PM
Last modification on : Monday, October 12, 2020 - 10:30:28 AM
Long-term archiving on : Friday, June 18, 2010 - 8:26:16 PM

File

glatard_2007.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-00460156, version 1

Citation

Tristan Glatard. Description, deployment and optimization of medical image analysis workflows on production grids. Human-Computer Interaction [cs.HC]. Université de Nice Sophia Antipolis, 2007. English. ⟨tel-00460156⟩

Metrics

Record views

372

File downloads

724