Abstract: Although our everyday life and society now depend heavily on
communication and computation infrastructures,
scientists and engineers have always been among the main consumers of
computing power. This document provides a coherent overview of the
research I have conducted over the last 15 years, which targets the
management and performance evaluation of large-scale distributed
computing infrastructures (such as clusters, grids, desktop grids, and
volunteer computing platforms) when used for scientific computing.
In the first part of this document, I present how I have addressed
scheduling problems arising on distributed platforms (such as computing
grids), with a particular emphasis on heterogeneity and multi-user
issues, hence the connection with game theory. Most of these problems
are relaxed from a classical combinatorial optimization formulation
into a continuous form, which makes it easy to account for key
platform characteristics such as heterogeneity or complex topology
while providing efficient, practical, and distributed solutions.
The second part presents my main contributions to the SimGrid project,
a simulation toolkit for building simulators of distributed
applications (originally designed to evaluate scheduling algorithms).
It gives a unified presentation of how the questions of validation and
scalability have been addressed in SimGrid, as well as
thoughts on specific challenges related to methodological aspects and
to the application of SimGrid to the HPC context.