Simulation approach for resource management

Millian Poquet 1, 2
2 DATAMOVE - Data Aware Large Scale Computing
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract : Computing platforms keep growing in power and complexity. Numerous challenges remain in building the next generations of platforms, but exploiting these platforms is a challenge in itself. Constraints such as energy consumption, data movement and resilience may become breaking points in the way platforms are managed, especially as the different types of distributed platforms converge. Resource and Jobs Management Systems (RJMSs) are critical middleware that allow users to exploit the resources of such platforms. They must evolve to make the best use of computing platforms while complying with these new constraints. Each evolution ideally requires many iterations, but conducting them in vivo is not reasonable because of the huge overhead involved. Simulation is an efficient way to tackle these problems, but particular caution is needed when drawing conclusions from simulation, as ill-suited models may lead to invalid results. The first contribution of this thesis is a modular simulation methodology for studying RJMSs and their evolution realistically, together with the related simulator, Batsim. The main idea is to strongly separate the simulation from the decision-making algorithms. This separation of concerns lets any algorithm benefit from a validated simulation with multiple levels of realism (features, accuracy of the models). The methodology eases putting new policies into production, since both academic prototypes and production RJMSs can be studied in the same context. Batsim is used in the second part of this thesis, which focuses on online and non-clairvoyant resource management policies to save energy. Several algorithms are first proposed and analyzed to maximize performance under an energy budget for a given time period. The thesis then explores, more generally, the energy and performance trade-offs that can be obtained with node shutdown techniques.

https://tel.archives-ouvertes.fr/tel-01757245
Contributor : Abes Star
Submitted on : Thursday, October 4, 2018 - 10:02:06 AM
Last modification on : Wednesday, July 10, 2019 - 1:25:35 AM

File

POQUET_2017_archivage.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01757245, version 2

Citation

Millian Poquet. Simulation approach for resource management. Data Structures and Algorithms [cs.DS]. Université Grenoble Alpes, 2017. English. ⟨NNT : 2017GREAM098⟩. ⟨tel-01757245v2⟩
