
Data management in a cloud federation

Abstract : Cloud federations can be seen as major progress in cloud computing, in particular in the medical domain. Indeed, sharing medical data would improve healthcare. Federating resources makes it possible to access information on a patient even when that person is mobile and their hospital data are distributed across several sites. Besides, it enables us to consider larger volumes of data on more patients and thus provide finer statistics. Medical data usually conform to the Digital Imaging and Communications in Medicine (DICOM) standard. DICOM files can be stored on different platforms, such as Amazon, Microsoft, Google Cloud, etc. The management of these files, including sharing and processing, follows the pay-as-you-go model on such platforms, according to distinct pricing models and relying on various systems (relational database management systems or NoSQL systems). In addition, DICOM data can be structured following traditional (row or column) or hybrid (row-column) data storage layouts. As a consequence, medical data management in cloud federations raises Multi-Objective Optimization Problems (MOOPs) for (1) query processing and (2) data storage, according to users' preferences, related to various measures, such as response time, monetary cost, quality, etc. These problems are complex to address because of the heterogeneous database engines, the variability (due to virtualization, large-scale communications, etc.) and the high computational complexity of a cloud federation. To solve these problems, we propose a MedIcal system on clouD federAtionS (MIDAS). First, MIDAS extends IReS, an open source platform for complex analytics workflows executed over multi-engine environments, to solve MOOPs across heterogeneous database engines. Second, we propose an algorithm for estimating cost values in a cloud environment, called Dynamic REgression AlgorithM (DREAM). This approach adapts to the variability of the cloud environment by changing the size of the data used for training and testing, which avoids relying on expired system information. Third, a Non-dominated Sorting Genetic Algorithm based on Grid partitioning (NSGA-G) is proposed to address the large candidate space of MOOPs. NSGA-G aims to find an approximate optimal solution, while improving the quality of the optimal Pareto set of a MOOP. In addition to query processing, we propose to use NSGA-G to find an approximate optimal solution for DICOM data configuration. We provide experimental evaluations to validate DREAM and NSGA-G on various test problems and datasets. DREAM is compared with other machine learning algorithms in terms of the accuracy of its estimated costs. The quality of NSGA-G is compared to other NSGAs on many problems in the MOEA framework. NSGA-G is also applied to the DICOM dataset to find optimal solutions. Experimental results show the good quality of our solutions for estimating and optimizing multi-objective problems in a cloud federation.
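As an illustration of the Pareto set mentioned in the abstract, the following minimal sketch filters a handful of candidate query plans down to the non-dominated ones for two objectives, response time and monetary cost. It is not NSGA-G or any part of MIDAS: the plan names, the numbers, and the helper functions `dominates` and `pareto_front` are hypothetical and only show what "non-dominated" means when both objectives are minimized.

```python
from typing import List, Tuple

# Hypothetical candidate plan: (name, response_time_s, monetary_cost_usd).
Plan = Tuple[str, float, float]


def dominates(a: Plan, b: Plan) -> bool:
    """Return True if plan a is no worse than b on both objectives
    and strictly better on at least one (both objectives minimized)."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Made-up values, for illustration only.
plans: List[Plan] = [
    ("plan_a", 12.0, 0.40),
    ("plan_b", 8.0, 0.90),
    ("plan_c", 15.0, 0.55),  # slower and more expensive than plan_a
    ("plan_d", 8.0, 1.20),   # same time as plan_b but more expensive
]

front = pareto_front(plans)
print([name for name, _, _ in front])
```

Here neither plan_a nor plan_b dominates the other, so a user preference (for example, a weight on cost versus time) is still needed to pick one. This quadratic scan is fine for a few plans; the abstract's point is that the candidate space in a cloud federation is too large for exhaustive approaches, hence the genetic algorithm.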

Cited literature : 168 references
Contributor : Abes Star
Submitted on : Tuesday, September 1, 2020 - 11:35:43 AM
Last modification on : Wednesday, September 9, 2020 - 4:18:36 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02926972, version 1


Trung-Dung Le. Data management in a cloud federation. Mobile Computing. Université Rennes 1; Université d'Ottawa, 2019. English. ⟨NNT : 2019REN1S101⟩. ⟨tel-02926972⟩


