L. Abdelhakim, FEEL++ applications to engineering problems, AIP Conference Proceedings, vol.1978, p.55, 2018.

A. M. Aji, J. Dinan, D. Buntinas, P. Balaji, W. Feng et al., MPI-ACC: An integrated and extensible approach to data movement in accelerator-based systems, 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems, p.223, 2012.

A. M. Aji, A. J. Peña, P. Balaji, and W. Feng, Automatic command queue scheduling for task-parallel workloads in OpenCL, 2015 IEEE International Conference on Cluster Computing. IEEE, p.223, 2015.

T. Amada, M. Imura, Y. Yasumuro, Y. Manabe, and K. Chihara, Particle-based fluid simulation on GPU, ACM workshop on general-purpose computing on graphics processors, vol.41, p.56, 2004.

C. Anderson and C. Greengard, On vortex methods, SIAM journal on numerical analysis, vol.22, p.19, 1985.

P. Angot, C. Bruneau, and P. Fabrie, A penalization method to take into account obstacles in incompressible viscous flows, Numerische Mathematik 81, vol.4, p.70, 1999.

S. F. Antao, A. Bataev, A. C. Jacob, G. Bercea, A. E. Eichenberger et al., Offloading support for OpenMP in Clang and LLVM, Proceedings of the Third Workshop on LLVM Compiler Infrastructure in HPC, p.47, 2016.

S. V. Apte, K. Mahesh, P. Moin, and J. C. Oefelein, Large-eddy simulation of swirling particle-laden flows in a coaxial-jet combustor, International Journal of Multiphase Flow, vol.29, p.28, 2003.

E. Arge, A. M. Bruaset, and H. P. Langtangen, Modern software tools for scientific computing, p.133, 1997.

M. Arora, The architecture and evolution of CPU-GPU systems for general purpose computing, San Diego, vol.27, p.45, 2012.

G. Balarac, G. Cottet, J. Etancelin, J. Lagaert, F. Pérignon et al., Multi-scale problems, high performance computing and hybrid numerical methods, The Impact of Applications on Mathematics, p.134, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00949669

A. Baqais, . Assayony, M. Khan, and . Al-mouhamed, Bank Conflict-Free Access for CUDA-Based Matrix Transpose Algorithm on GPUs, International Conference on Computer Applications Technology, p.160, 2013.

J. Baranger, M. Garbey, and F. Oudin-Dardun, On Aitken like acceleration of Schwarz domain decomposition method using generalized Fourier, Domain Decomposition Methods in Science and Engineering, p.223, 2003.

. Bibliography,

B. Barker, Message passing interface (MPI), Workshop: High Performance Computing on Stampede, vol.262, p.47, 2015.

B. Barney, POSIX threads programming, National Laboratory, p.48, 2009.

C. Bassi, A. Abbà, L. Bonaventura, and L. Valdettaro, Direct and Large Eddy Simulation of three-dimensional non-Boussinesq gravity currents with a high order DG method, p.31, 2018.

G. K. Batchelor, Small-scale variation of convected quantities like temperature in turbulent fluid Part 1. General discussion and the case of small conductivity, Journal of Fluid Mechanics, vol.5, issue.1, pp.113-133, 1959.

J. T. Beale and A. Majda, Vortex methods. II. Higher order accuracy in two and three dimensions, Mathematics of Computation 39, vol.159, p.64, 1982.

D. M. Beazley, SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++, Tcl/Tk Workshop, p.232, 1996.

C. Beckermann and R. Viskanta, Double-diffusive convection during dendritic solidification of a binary mixture, PhysicoChemical Hydrodynamics 10, vol.2, p.33, 1988.

S. Behnel, R. Bradshaw, C. Citro, L. Dalcin, D. S. Seljebotn et al., Cython: The best of both worlds, Computing in Science & Engineering 13, vol.2, p.58, 2011.

N. Bell and J. Hoberock, Thrust: A productivity-oriented library for CUDA, GPU computing gems Jade edition, p.151, 2012.

T. B. Benjamin, Gravity currents and related phenomena, Journal of Fluid Mechanics, vol.31, p.29, 1968.

D. Bercovici, A theoretical model of cooling viscous gravity currents with temperature-dependent viscosity, Geophysical Research Letters, vol.21, p.29, 1994.

L. Bergstrom, Measuring NUMA effects with the STREAM benchmark, p.164, 2011.

K. Bhat, clpeak: A tool which profiles OpenCL devices to find their peak capacities, vol.232, p.50, 2017.

E. Biegert, B. Vowinckel, and E. Meiburg, A collision model for grain-resolving simulations of flows over dense, mobile, polydisperse granular sediment beds, Journal of Computational Physics, vol.340, p.28, 2017.

V. K. Birman and E. Meiburg, High-resolution simulations of gravity currents, Journal of the Brazilian Society of Mechanical Sciences and Engineering, issue.2, p.31, 2006.

V. K. Birman, E. Meiburg, and M. Ungarish, On gravity currents in stratified ambients, Physics of Fluids, vol.19, p.31, 2007.

G. A. Blaauw and F. P. Brooks Jr, Computer architecture: Concepts and evolution, p.41, 1997.

L. Bluestein, A linear filtering approach to the computation of discrete Fourier transform, IEEE Transactions on Audio and Electroacoustics 18, vol.4, p.99, 1970.

B. M. Bode, J. J. Hill, and T. R. Benjegerdes, Cluster interconnect overview, Proceedings of USENIX 2004 Annual Technical Conference, FREENIX Track, p.187, 2004.

M. Boivin, O. Simonin, and K. Squires, Direct numerical simulation of turbulence modulation by particles in isotropic turbulence, Journal of Fluid Mechanics, vol.375, p.28, 1998.

R. T. Bonnecaze, H. E. Huppert, and J. R. Lister, Particle-driven gravity currents, Journal of Fluid Mechanics, vol.250, p.29, 1993.

R. T. Bonnecaze and J. Lister, Particle-driven gravity currents down planar slopes, Journal of Fluid Mechanics, vol.390, p.30, 1999.

J. Boussinesq, Essai sur la théorie des eaux courantes. Impr. nationale, p.14, 1877.

T. Brandvik and G. Pullan, SBLOCK: A framework for efficient stencil-based PDE solvers on multi-core platforms, 2010 10th IEEE International Conference on Computer and Information Technology, p.90, 2010.

R. E. Britter and J. E. Simpson, Experiments on the dynamics of a gravity current head, Journal of Fluid Mechanics, vol.88, p.29, 1978.

D. L. Brown, Performance of under-resolved two-dimensional incompressible flow simulations, Journal of Computational Physics, vol.122, issue.1, p.145, 1995.

K. J. Burns, G. M. Vasil, J. S. Oishi, D. Lecoanet et al., Dedalus: Flexible framework for spectrally solving differential equations, Astrophysics Source Code Library, p.57, 2016.

P. Burns and E. Meiburg, Sediment-laden fresh water above salt water: linear stability analysis, Journal of Fluid Mechanics, vol.691, pp.279-314, 2012.

, Sediment-laden fresh water above salt water: nonlinear simulations, Journal of Fluid Mechanics, vol.762, pp.156-195, 2015.

J. C. Butcher, Coefficients for the study of Runge-Kutta integration processes, Journal of the Australian Mathematical Society, vol.3, issue.2, p.81, 1963.

J. C. Butcher, The numerical analysis of ordinary differential equations: Runge-Kutta and general linear methods, vol.512, p.81, 1987.

Y. A. Buyevich, Statistical hydromechanics of disperse systems Part 1. Physical background and general equations, Journal of Fluid Mechanics, vol.49, p.27, 1971.

D. R. Caldwell, Thermal and Fickian diffusion of sodium chloride in a solution of oceanic concentration, Deep Sea Research and Oceanographic Abstracts, vol.20, p.35, 1973.

L. Canon, L. Marchal, B. Simon, and F. Vivien, Online scheduling of task graphs on hybrid platforms, European Conference on Parallel Processing, p.146, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01720064

C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang, Spectral methods in fluid dynamics, p.71, 2012.

C. Cao, J. Dongarra, P. Du, M. Gates, P. Luszczek et al., clMAGMA: High performance dense linear algebra with OpenCL, Proceedings of the International Workshop on OpenCL, p.159, 2013.

C. Carvalho, The gap between processor and memory speeds, Proc. of IEEE International Conference on Control and Automation, p.43, 2002.

J. R. Cary, S. G. Shasharina, J. C. Cummings et al., Comparison of C++ and Fortran 90 for object-oriented scientific programming, Computer Physics Communications 105.1, p.133, 1997.

Y. A. Çengel, R. H. Turner, and J. M. Cimbala, Fundamentals of thermal-fluid sciences, vol.703, p.17, 2001.

J. L. Cercos-pita, AQUAgpusph, a new free 3D SPH solver accelerated with OpenCL, Computer Physics Communications, vol.192, p.56, 2015.

S. Chan and K. Ho, Fast algorithms for computing the discrete cosine transform, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol.39, p.176, 1992.

S. Chan and K. Ho, Direct methods for computing discrete sinusoidal transforms, IEE Proceedings F (Radar and Signal Processing), p.176, 1990.

P. Chatelain, A. Curioni, M. Bergdorf, D. Rossinelli, W. Andreoni et al., Billion vortex particle direct numerical simulations of aircraft wakes, Computer Methods in Applied Mechanics and Engineering, vol.197, p.56, 2008.

, Vortex methods for massively parallel computer architectures, International Conference on High Performance Computing for Computational Science, p.56, 2008.

J. Chauchat, Z. Cheng, T. Nagel, C. Bonamy, and T. Hsu, SedFoam-2.0: a 3-D two-phase flow numerical model for sediment transport, Geoscientific Model Development, vol.10, p.55, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01652998

C. F. Chen and D. H. Johnson, Double-diffusive convection: a report on an engineering foundation conference, Journal of Fluid Mechanics, vol.138, p.33, 1984.

C. Chen and F. J. Millero, Precise equation of state of seawater for oceanic ranges of salinity, temperature and pressure, Deep Sea Research, vol.24, p.33, 1977.

G. Chen, Q. Xiong, P. J. Morris, E. G. Paterson, A. Sergeev et al., OpenFOAM for computational fluid dynamics, p.55, 2014.

A. J. Chorin, Numerical study of slightly viscous flow, Journal of fluid mechanics, vol.57, pp.785-796, 1973.

, Vortex sheet approximation of boundary layers, Journal of computational physics, vol.27, p.19, 1978.

A. Chrzeszczyk, Matrix Computations on the GPU with ArrayFire-Python and ArrayFire-C/C++, p.151, 2017.

N. Cohen, P. Passaggia, A. Scotti, and B. White, Experiments on turbulent horizontal convection at high Schmidt numbers, Bulletin of the American Physical Society, vol.63, p.224, 2018.

NVIDIA Compute, PTX: Parallel thread execution ISA version 2.3, p.48, 2010.

D. Cooke and T. Hochberg, Numexpr. Fast evaluation of array expressions by using a vector-based virtual machine, p.157, 2017.

J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Mathematics of computation 19, vol.90, pp.297-301, 1965.

M. Coquerelle and G. Cottet, A vortex level set method for the two-way coupling of an incompressible fluid with colliding rigid bodies, Journal of Computational Physics, vol.227, p.28, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00297673

P. Costa, A FFT-based finite-difference solver for massively-parallel direct numerical simulations of turbulent flows, Computers & Mathematics with Applications, vol.76, p.57, 2018.

G. Cottet, G. Balarac, and M. Coquerelle, Subgrid particle resolution for the turbulent transport of a passive scalar, Advances in Turbulence XII, pp.779-782, 2009.
URL : https://hal.archives-ouvertes.fr/hal-01587516

G. Cottet, J. Etancelin, F. Pérignon, and C. Picard, High order semi-Lagrangian particle methods for transport equations: numerical analysis and implementation issues, ESAIM: Mathematical Modelling and Numerical Analysis, vol.48, pp.1029-1060, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00991150

G. Cottet, A new approach for the analysis of vortex methods in two and three dimensions, Annales de l'Institut Henri Poincare (C) Non Linear Analysis, vol.5, p.19, 1988.

, Semi-Lagrangian particle methods for hyperbolic problems, 2016.

G. Cottet and P. D. Koumoutsakos, Vortex methods: theory and practice, 2000.

G. Cottet and A. Magni, TVD remeshing formulas for particle methods, Comptes Rendus Mathematique, vol.347, pp.1367-1372, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00321323

G. Cottet, J. Etancelin, F. Perignon, C. Picard, F. D. Vuyst et al., Is GPU the future of Scientific Computing?, Annales mathématiques Blaise Pascal, vol.20, p.71, 2013.

R. Courant, E. Isaacson, and M. Rees, On the solution of nonlinear hyperbolic differential equations by finite differences, Communications on Pure and Applied Mathematics, vol.5, p.71, 1952.

R. Courant, K. Friedrichs, and H. Lewy, On the partial difference equations of mathematical physics, IBM journal of Research and Development, vol.11, issue.2, p.66, 1967.

J. Crank and P. Nicolson, A practical method for numerical evaluation of solutions of partial differential equations of the heat-conduction type, Mathematical Proceedings of the Cambridge Philosophical Society, vol.43, p.89, 1947.

C. Cummins, P. Petoumenos, M. Steuwer, and H. Leather, Autotuning OpenCL workgroup size for stencil patterns, p.90, 2015.

L. Dagum and R. Menon, OpenMP: An industry-standard API for shared-memory programming, Computing in Science & Engineering, vol.1, p.47, 1998.

L. Dalcín, R. Paz, and M. Storti, MPI for Python, Journal of Parallel and Distributed Computing, vol.65, p.230, 2005.

L. Dalcin, M. Mortensen, and D. E. Keyes, Fast parallel multidimensional FFT using advanced MPI, Journal of Parallel and Distributed Computing, vol.128, pp.137-150, 2019.

K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter et al., Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, Proceedings of the 2008 ACM/IEEE conference on Supercomputing, vol.159, p.90, 2008.

K. Datta and K. A. Yelick, Auto-tuning stencil codes for cache-based multicore platforms, p.159, 2009.

D. R. Davies, C. R. Wilson, and S. C. Kramer, Fluidity: A fully unstructured anisotropic adaptive mesh computational modeling framework for geodynamics, Geochemistry, Geophysics, Geosystems, vol.12, issue.6, p.55, 2011.

D. Vuyst and F. , Performance modeling of a compressible hydrodynamics solver on multicore CPUs, Parallel Computing: On the Road to Exascale, vol.27, p.56, 2016.

D. Vuyst, T. Florian, R. Gasc, M. Motte, R. Peybernes et al., Lagrange-flux schemes: reformulating second-order accurate Lagrange-remap schemes for better node-based HPC performance, Oil & Gas Science and Technology-Revue d'IFP Energies nouvelles 71, vol.6, p.56, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01958905

G. Deacon, Water circulation and surface boundaries in the oceans, Quarterly Journal of the Royal Meteorological Society, vol.71, p.33, 1945.

D. Demidov, VexCL: Vector expression template library for OpenCL, p.151, 2012.

I. V. Derevich and L. I. Zaichik, Particle deposition from a turbulent flow, Fluid Dynamics 23.5, p.27, 1988.

R. Dolbeau, F. Bodin, and G. Verdiere, One OpenCL to rule them all?, 2013 IEEE 6th International Workshop on Multi-/Many-core Computing Systems (MuCoCoS). IEEE, pp.1-6, 2013.

J. Dongarra and M. A. Heroux, Toward a new metric for ranking high performance computing systems, Sandia Report, p.43, 2013.

J. J. Dongarra, P. Luszczek, and A. Petitet, The LINPACK benchmark: past, present and future, Concurrency and Computation: practice and experience 15, vol.9, p.42, 2003.

D. A. Donzis and P. K. Yeung, Resolution effects and scaling in numerical simulations of passive scalar mixing in turbulence, Physica D: Nonlinear Phenomena 239, vol.14, p.20, 2010.

D. A. Drew and S. L. Passman, Theory of multicomponent fluids, vol.135, p.26, 2006.

P. Du, R. Weber, P. Luszczek, S. Tomov, G. Peterson et al., From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming, Parallel Computing 38, vol.8, p.159, 2012.

P. Duhamel and H. Hollmann, 'Split radix' FFT algorithm, Electronics letters 20.1, p.99, 1984.

R. Duncan, A survey of parallel computer architectures, Computer 23, vol.2, p.51, 1990.

P. M. Duvall, S. Matyas, and A. Glover, Continuous integration: improving software quality and reducing risk. Pearson Education, p.151, 2007.

D. A. Edmonds and R. L. Slingerland, Significant effect of sediment cohesion on delta morphology, Nature Geoscience, vol.3, issue.2, p.22, 2010.

S. Elghobashi, A two-equation turbulence model for two-phase flows, Physics of Fluids, vol.26, p.27, 1983.

, Particle-laden turbulent flows: direct simulation and closure models, Applied Scientific Research, vol.48, p.24, 1991.

, On predicting particle-laden turbulent flows, Applied scientific research 52, vol.4, p.25, 1994.

J. Etancelin, Model coupling and hybrid computing for multi-scale CFD, 2014.
URL : https://hal.archives-ouvertes.fr/tel-01094645

J. Etancelin, G. Cottet, F. Pérignon, and C. Picard, Multi-CPU and multi-GPU hybrid computations of multi-scale scalar transport, 26th International Conference on Parallel Computational Fluid Dynamics, pp.83-84, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00982986

L. Euler, Principes généraux de l'état d'équilibre des fluides, Mémoires de l'académie des sciences de Berlin, p.7, 1757.

R. W. Faas, Time and density-dependent properties of fluid mud suspensions, NE Brazilian continental shelf, p.23, 1984.

G. M. Faeth, Mixing, transport and combustion in sprays, Progress in energy and combustion science 13, vol.4, p.27, 1987.

T. L. Falch and A. C. Elster, Machine learning based auto-tuning for enhanced OpenCL performance portability, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop. IEEE, p.160, 2015.

J. Fang, A. L. Varbanescu, and H. Sips, A comprehensive performance comparison of CUDA and OpenCL, 2011 International Conference on Parallel Processing, pp.216-225, 2011.

R. I. Ferguson and M. Church, A simple universal equation for grain settling velocity, Journal of sedimentary Research, vol.74, pp.933-937, 2004.

T. Filiba, blog/Code-Generation-Context-Managers/, p.153, 2012.

B. Fornberg, Generation of finite difference formulas on arbitrarily spaced grids, Mathematics of computation 51, vol.184, p.86, 1988.

, A practical guide to pseudospectral methods, vol.1, p.71, 1998.

I. T. Foster and P. H. Worley, Parallel algorithms for the spectral transform method, SIAM Journal on Scientific Computing, vol.18, p.192, 1997.

J. Fredsøe, Mechanics of coastal sediment transport, Advanced Series on Ocean Engineering, vol.3, p.22, 1992.

H. Freundlich and . Jones, Sedimentation Volume, Dilatancy, Thixotropic and Plastic Properties of Concentrated Suspensions, The Journal of Physical Chemistry, vol.40, p.23, 1936.

M. Frigo and S. G. Johnson, FFTW: An adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol.3, p.229, 1998.

, The design and implementation of FFTW3, Proceedings of the IEEE 93, vol.2, pp.216-231, 2005.

, FFTW: Fastest Fourier transform in the west, Astrophysics Source Code Library, 2012.

H. Fu, J. Liao, J. Yang, L. Wang, Z. Song et al., The Sunway TaihuLight supercomputer: system and applications, Science China Information Sciences 59, p.41, 2016.

E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra et al., Open MPI: Goals, concept, and design of a next generation MPI implementation, European Parallel Virtual Machine/Message Passing Interface Users' Group Meeting, p.47, 2004.

M. Garbey and D. Tromeur-dervout, On some Aitken-like acceleration of the Schwarz method, International journal for numerical methods in fluids 40, vol.12, p.223, 2002.

W. Gautschi, Orthogonal polynomials, p.119, 2004.

T. Geller, Supercomputing's exaflop target, Communications of the ACM, vol.54, p.43, 2011.

W. M. Gentleman and G. Sande, Fast Fourier Transforms: for fun and profit, Proceedings of the November 7-10, 1966, fall joint computer conference, p.99, 1966.

R. J. Gibbs, Estuarine flocs: their size, settling velocity and density, Journal of Geophysical Research: Oceans 90.C2, p.37, 1985.

I. J. Good, The interaction algorithm and practical Fourier analysis, Journal of the Royal Statistical Society: Series B, issue.2, p.99, 1958.

T. Gotoh, S. Hatanaka, and H. Miura, Spectral compact difference hybrid computation of passive scalar in isotropic turbulence, Journal of Computational Physics, vol.231, pp.7398-7414, 2012.

D. Gottlieb and S. A. Orszag, Numerical analysis of spectral methods: theory and applications, vol.26, p.98, 1977.

D. Goz, L. Tornatore, G. Taffoni, and G. Murante, Cosmological Simulations in Exascale Era, p.56, 2017.

P. Grandclement, Introduction to spectral methods, Stellar fluid dynamics and numerical simulations: from the sun to neutron stars, EAS Publications Series, vol.21, p.119, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00091367

S. Grauer-Gray, L. Xu, R. Searles, S. Ayalasomayajula, and J. Cavazos, Auto-tuning a high-level language targeted to GPU codes, 2012 Innovative Parallel Computing (InPar). IEEE, p.159, 2012.

T. Green, The importance of double diffusion to the settling of suspended material, p.36, 1987.

S. Guelton, P. Brunet, M. Amini, A. Merlini, X. Corbillon et al., Pythran: Enabling static optimization of scientific python programs, Computational Science & Discovery, vol.8, issue.1, p.58, 2015.

A. Gupta and V. Kumar, The scalability of FFT on parallel computers, IEEE Transactions on Parallel and, p.192, 1993.

M. Gad-el-Hak, Questions in fluid mechanics, Journal of Fluids Engineering, vol.117, p.12, 1995.

A. Hamuraru, Atomic operations for floats in OpenCL, p.168, 2016.

F. H. Harlow and J. Welch, Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface, The physics of fluids 8.12, p.86, 1965.

W. B. Hart, Fast library for number theory: an introduction, International Congress on Mathematical Software, p.230, 2010.

C. Härtel, E. Meiburg, and F. Necker, Analysis and direct numerical simulation of the flow at a gravity-current head. Part 1. Flow topology and front speed for slip and no-slip boundaries, Journal of Fluid Mechanics, vol.418, p.30, 2000.

D. C. Haworth and S. B. Pope, A generalized Langevin model for turbulent flows, The Physics of fluids 29, vol.2, p.27, 1986.

Q. He, H. Chen, and J. Feng, Acceleration of the OpenFOAM-based MHD solver using graphics processing units, Fusion Engineering and Design, vol.101, p.55, 2015.

B. Hejazialhosseini, D. Rossinelli, C. Conti, and P. Koumoutsakos, High throughput software for direct numerical simulations of compressible two-phase flows, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p.56, 2012.

S. Hemmert, Green hpc: From nice to necessity, Computing in Science & Engineering, vol.12, issue.8, p.43, 2010.

S. Henry, A. Denis, D. Barthou, M. Counilh, and R. Namyst, Toward OpenCL automatic multi-device support, European Conference on Parallel Processing, p.223, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01005765

A. Hérault, G. Bilotta, and R. Dalrymple, SPH on GPU with CUDA, Journal of Hydraulic Research 48.S1, p.56, 2010.

T. Higuchi, N. Yoshifuji, T. Sakai, Y. Kitta, R. Takano et al., ClPy: A NumPy-Compatible Library Accelerated with OpenCL, 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, p.151, 2019.

K. Hillewaert, Direct numerical simulation of the Taylor-Green vortex at Re = 1600, 2nd International High Order CFD Workshop. On the WWW, vol.182, p.181, 2012.

F. Hjulstrom, Studies of the morphological activity of rivers as illustrated by the River Fyris, Bulletin of the Geological Institute Upsala, vol.25, p.24, 1935.

T. Hoefler, J. Dinan, R. Thakur, B. Barrett, P. Balaji et al., Remote memory access programming in MPI-3, ACM Transactions on Parallel Computing, vol.2, issue.2, p.47, 2015.

G. Hoffmann, M. M. Nasr-azadani, and E. Meiburg, Sediment wave formation caused by erosional and depositional turbidity currents: A numerical investigation, Procedia IUTAM 15, vol.55, p.31, 2015.

J. Holewinski, L. Pouchet, and P. Sadayappan, High-performance code generation for stencil computations on GPU architectures, Proceedings of the 26th ACM international conference on Supercomputing. ACM, p.90, 2012.

B. P. B. Hoomans, J. A. M. Kuipers, W. J. Briels, and W. P. M. van Swaaij, Discrete particle simulation of bubble and slug formation in a two-dimensional gas-fluidised bed: a hard-sphere approach, Chemical Engineering Science, vol.51, p.27, 1996.

D. Houk and T. Green, Descent rates of suspension fingers, Deep Sea Research and Oceanographic Abstracts, vol.20, pp.757-761, 1973.

D. C. J. D. Hoyal, M. I. Bursik, and J. F. Atkinson, The influence of diffusive convection on sedimentation from buoyant plumes, Marine Geology, vol.159, p.37, 1999.

D. W. Hughes and M. Proctor, Magnetic fields in the solar convection zone: magnetoconvection and magnetic buoyancy, Annual Review of Fluid Mechanics, vol.20, issue.1, p.33, 1988.

J. Hunt, Concurrency with AsyncIO, Advanced Guide to Python 3 Programming, p.223, 2019.

H. E. Huppert, The propagation of two-dimensional and axisymmetric viscous gravity currents over a rigid horizontal surface, Journal of Fluid Mechanics, vol.121, p.29, 1982.

H. E. Huppert and P. C. Manins, Limiting conditions for salt-fingering at an interface, Deep Sea Research and Oceanographic Abstracts, vol.20, p.36, 1973.

H. E. Huppert and J. Turner, Double-diffusive convection, Journal of Fluid Mechanics, vol.106, p.33, 1981.

H. E. Huppert and R. S. J. Sparks, Double-diffusive convection due to crystallization in magmas, Annual Review of Earth and Planetary Sciences, vol.12, issue.1, p.33, 1984.

K. E. Hyland, S. McKee, and M. W. Reeks, Derivation of a pdf kinetic equation for the transport of particles in turbulent flows, Journal of Physics A: Mathematical and General, vol.32, p.27, 1999.

A. Hynninen and D. I. Lyakh, cuTT: A high-performance tensor transpose library for CUDA compatible GPUs, vol.223, p.162, 2017.

B. A. Ibrahimoglu, Lebesgue functions and Lebesgue constants in polynomial interpolation, Journal of Inequalities and Applications, vol.1, p.119, 2016.

M. C. Ingham, The salinity extrema of the world ocean, vol.34, p.33, 1965.

P. Jääskeläinen, C. Sánchez de La Lama, E. Schnetter, K. Raiskila et al., pocl: A performance-portable OpenCL implementation, International Journal of Parallel Programming, vol.43, p.48, 2015.

C. Jacobi, Untersuchungen über die Differentialgleichung der hypergeometrischen Reihe, Journal für die reine und angewandte Mathematik, vol.56, p.119, 1859.

E. Jeannot and G. Mercier, Near-optimal placement of MPI processes on hierarchical NUMA architectures, European Conference on Parallel Processing, p.47, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00544346

M. Jedouaa, C. Bruneau, and E. Maitre, An efficient interface capturing method for a large collection of interacting bodies immersed in a fluid, Journal of Computational Physics, vol.378, p.28, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01236468

Z. Jia, M. Maggioni, B. Staiger, and D. P. Scarpazza, Dissecting the NVIDIA Volta GPU architecture via microbenchmarking, p.174, 2018.

J. L. Jodra, I. Gurrutxaga, and J. Muguerza, Efficient 3D transpositions in graphics processing units, International Journal of Parallel Programming, vol.43, p.91, 2015.

F. Johansson, Arb: a C library for ball arithmetic, ACM Comm. Computer Algebra, vol.47, issue.3/4, p.230, 2013.

S. G. Johnson, Notes on FFT-based differentiation, MIT Applied Mathematics April, p.100, 2011.

B. Kadoch, D. Kolomenskiy, P. Angot, and K. Schneider, A volume penalization method for incompressible flows and scalar advection-diffusion with moving obstacles, Journal of Computational Physics, vol.231, p.70, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01032208

D. E. Kelley, Fluxes through diffusive staircases: A new formulation, Journal of Geophysical Research: Oceans 95.C3, p.36, 1990.

D. E. Kelley, H. J. S. Fernando, A. E. Gargett, J. Tanny, and E. Özsoy, The diffusive regime of double-diffusive convection, Progress in Oceanography, vol.56, issue.3-4, p.33, 2003.

T. Kempe and J. Fröhlich, An improved immersed boundary method with direct forcing for the simulation of particle laden flows, Journal of Computational Physics, vol.231, p.28, 2012.

R. Keys, Cubic convolution interpolation for digital image processing, IEEE transactions on acoustics, speech, and signal processing 29, vol.6, p.72, 1981.

A. Khajeh-Saeed and J. B. Perot, Direct numerical simulation of turbulence using GPU accelerated supercomputers, Journal of Computational Physics, vol.235, p.187, 2013.

M. A. Khodkar, M. M. Nasr-Azadani, and E. Meiburg, Gravity currents propagating into two-layer stratified fluids: vorticity-based models, Journal of Fluid Mechanics, vol.844, p.31, 2018.

Khronos OpenCL Working Group, The OpenCL specification version 1.1, p.149, 2010.

, The OpenCL Specification, 2011.

H. Kim and R. Bond, Multicore software technologies, IEEE Signal Processing Magazine, vol.26, p.47, 2009.

V. Kindratenko and P. Trancoso, Trends in high-performance computing, Computing in Science & Engineering, vol.13, p.44, 2011.

A. Kleen, A NUMA API for Linux, Novell Inc, vol.163, p.47, 2005.

A. Klöckner, N. Pinto, Y. Lee, B. Catanzaro, P. Ivanov et al., PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation, Parallel Computing, vol.38, pp.157-174, 2012.

H. Komatsu, A characterization of real analytic functions, Proceedings of the Japan Academy, vol.36, issue.3, p.106, 1960.

N. Konopliv, L. Lesshafft, and E. Meiburg, The influence of shear on double-diffusive and settling-driven instabilities, Journal of Fluid Mechanics, vol.849, pp.902-926, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02322685

E. Konstantinidis and Y. Cotronis, A practical performance model for compute and memory bound GPU kernels, 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, p.232, 2015.

, A quantitative roofline model for GPU kernel performance estimation using micro-benchmarks and hardware metric profiling, Journal of Parallel and Distributed Computing, vol.107, p.54, 2017.

P. Koumoutsakos, Inviscid axisymmetrization of an elliptical vortex, Journal of Computational Physics, vol.138, p.82, 1997.

P. Koumoutsakos and A. Leonard, High-resolution simulations of the flow around an impulsively started cylinder using vortex methods, Journal of Fluid Mechanics, vol.296, p.74, 1995.

R. Krishnamurti, Y. Jo, and A. Stocchino, Salt fingers at low Rayleigh numbers, Journal of Fluid Mechanics, vol.452, p.36, 2002.

J. G. M. Kuerten, Point-Particle DNS and LES of Particle-Laden Turbulent flow-a state-of-the-art review, Flow, turbulence and combustion 97, vol.3, p.28, 2016.

W. Kutta, Beitrag zur näherungsweisen Integration totaler Differentialgleichungen, Z. Math. Phys, vol.46, p.80, 1901.

Y. Kwok and I. Ahmad, Static scheduling algorithms for allocating directed task graphs to multiprocessors, ACM Computing Surveys (CSUR) 31.4, p.146, 1999.

O. A. Ladyzhenskaya, The mathematical theory of viscous incompressible flow, vol.2, 1969.

J. Lagaert, G. Balarac, and G. Cottet, Hybrid spectral-particle method for the turbulent transport of a passive scalar, Journal of Computational Physics, vol.260, pp.127-142, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00935487

J.-B. Lagaert, G. Balarac, G. Cottet, and P. Bégou, Particle method: an efficient tool for direct numerical simulations of a high Schmidt number passive scalar in turbulent flow, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00748132

S. K. Lam, A. Pitrou, and S. Seibert, Numba: A LLVM-based Python JIT compiler, Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. ACM, vol.231, p.58, 2015.

C. Lameter, NUMA (Non-Uniform Memory Access): An Overview, Acm queue 11, vol.7, p.41, 2013.

A. Lani, N. Villedie, K. Bensassi, L. Koloszar, M. Vymazal et al., COOLFluiD: an open computational platform for multi-physics simulation and research, 21st AIAA Computational Fluid Dynamics Conference, p.56, 2013.

A. Lani, M. S. Yalim, and S. Poedts, A GPU-enabled Finite Volume solver for global magnetospheric simulations on unstructured grids, Computer Physics Communications, vol.185, p.56, 2014.

C. Lattner and V. Adve, LLVM: A compilation framework for lifelong program analysis & transformation, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, vol.232, p.166, 2004.

A. Leonard, Numerical simulation of interacting three-dimensional vortex filaments, Proceedings of the Fourth International Conference on Numerical Methods in Fluid Dynamics, p.19, 1975.

A. Leonard, Computing three-dimensional incompressible flows with vortex elements, Annual Review of Fluid Mechanics, vol.17, p.19, 1985.

J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace, Acta Mathematica, vol.63, p.7, 1934.

W. K. Lewis, The evaporation of a liquid into a gas, Trans. ASME, vol.44, p.17, 1922.

N. Li and S. Laizet, 2DECOMP&FFT: a highly scalable 2D decomposition library and FFT interface, Cray User Group 2010 conference, p.193, 2010.

W. Li, Z. Fan, X. Wei, and A. Kaufman, GPU-based flow simulation with complex boundaries, p.90, 2003.

Y. Li, Y. Zhang, Y. Liu, G. Long, and H. Jia, MPFFT: An auto-tuning FFT library for OpenCL GPUs, Journal of Computer Science and Technology, vol.28, p.159, 2013.

P. F. Linden, Salt fingers in the presence of grid-generated turbulence, Journal of Fluid Mechanics, vol.49, p.36, 1971.

E. J. List, R. C. Y. Koh, and J. Imberger, Mixing in inland and coastal waters, p.204, 1979.

J. Liu, Numerical solution of forward and backward problem for 2-D heat conduction equation, Journal of Computational and Applied Mathematics, vol.145, p.105, 2002.

A. Logg and G. N. Wells, DOLFIN: Automated finite element computing, ACM Transactions on Mathematical Software (TOMS), vol.37, issue.2, p.55, 2010.

A. Logg, K. Mardal, and G. Wells, Automated solution of differential equations by the finite element method: The FEniCS book, vol.84, p.55, 2012.

Lord Rayleigh, Investigation of the character of the equilibrium of an incompressible heavy fluid of variable density, Scientific papers, p.201, 1900.

D. Luebke, CUDA: Scalable parallel programming for high-performance scientific computing, 5th IEEE international symposium on biomedical imaging, pp.836-838, 2008.

D. Luebke, M. Harris, N. Govindaraju, A. Lefohn, M. Houston et al., GPGPU: general-purpose computation on graphics hardware, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, p.44, 2006.

D. I. Lyakh, An efficient tensor transpose algorithm for multicore CPU, Intel Xeon Phi, and NVidia Tesla GPU, Computer Physics Communications, vol.189, p.91, 2015.

A. Magni and G. Cottet, Accurate, non-oscillatory, remeshing schemes for particle methods, Journal of Computational Physics, vol.231, pp.152-172, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00588848

J. Malcolm, P. Yalamanchili, C. McClanahan, V. Venugopalakrishnan, K. Patel et al., ArrayFire: a GPU acceleration platform, Modeling and simulation for defense systems and applications VII, vol.8403, p.151, 2012.

Z. Malecha, Ł. Mirosław, T. Tomczak, Z. Koza, M. Matyka et al., GPU-based simulation of 3D blood flow in abdominal aorta using OpenFOAM, Archives of Mechanics, vol.63, issue.2, p.55, 2011.

S. A. Martucci, Symmetric convolution and the discrete sine and cosine transforms, IEEE Transactions on Signal Processing, vol.42, p.110, 1994.

J. Mathew and R. Vijayakumar, The Performance of Parallel Algorithms by Amdahl's Law, Gustafson's Trend, International Journal of Computer Science and Information Technologies, vol.2, issue.6, p.62, 2011.

H. G. Matuttis, S. Luding, and H. J. Herrmann, Discrete element simulations of dense packings and heaps made of spherical and non-spherical particles, Powder Technology, vol.109, p.27, 2000.

T. Maxworthy, The dynamics of sedimenting surface gravity currents, Journal of Fluid Mechanics, vol.392, p.36, 1999.

M. D. McCool, Scalable programming models for massively multicore processors, Proceedings of the IEEE, vol.96, issue.5, p.47, 2008.

E. Meiburg and B. Kneller, Turbidity currents and their deposits, Annual Review of Fluid Mechanics, vol.42, p.1, 2010.

E. Meiburg, S. Radhakrishnan, and M. Nasr-Azadani, Modeling gravity and turbidity currents: computational approaches and challenges, Applied Mechanics Reviews, vol.67, p.31, 2015.

H. W. Meuer and E. Strohmaier, Die TOP 500 Supercomputer in der Welt, Praxis der Informationsverarbeitung und Kommunikation, vol.16, issue.4, p.42, 1993.

A. Meurer, C. P. Smith, M. Paprocki, O. Čertík, S. B. Kirpichev et al., SymPy: symbolic computing in Python, PeerJ Computer Science, vol.3, p.230, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01645958

M. Meyer, Continuous integration and its tools, IEEE Software, vol.31, issue.3, p.151, 2014.

P. Micikevicius, 3D finite difference computation on GPUs using CUDA, Proceedings of 2nd workshop on general purpose processing on graphics processing units, vol.187, p.90, 2009.

C. C. Miller, Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, vol.106, issue.740, p.197, 1924.

J. D. Milliman and K. L. Farnsworth, River discharge to the coastal ocean: a global synthesis, p.1, 2013.

C. Mimeau, Conception and implementation of a hybrid vortex penalization method for solid-fluid-porous media : application to the passive control of incompressible flows, 2015.
URL : https://hal.archives-ouvertes.fr/tel-01178939

C. Mimeau, G. Cottet, and I. Mortazavi, Passive flow control around a semi-circular cylinder using porous coatings, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01215305

C. Mimeau, F. Gallizio, G. Cottet, and I. Mortazavi, Vortex penalization method for bluff body flows, International Journal for Numerical Methods in Fluids, vol.79, p.134, 2015.
URL : https://hal.archives-ouvertes.fr/hal-00936332

C. Mimeau, G. Cottet, and I. Mortazavi, Direct numerical simulations of three-dimensional flows past obstacles with a vortex penalization method, Computers & Fluids, vol.136, p.62, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01855265

C. Mimeau, I. Mortazavi, and G. Cottet, Passive control of the flow around a hemisphere using porous media, European Journal of Mechanics-B/Fluids, vol.65, p.134, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01483400

A. Mohanan, C. Vishnu, M. C. Bonamy, P. Linares, and . Augier, FluidSim: Modular, object-oriented python package for high-performance CFD simulations, p.57, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02121919

J. J. Monaghan, Extrapolating B splines for interpolation, Journal of Computational Physics, vol.60, issue.2, p.82, 1985.

J. J. Monaghan, R. A. F. Cas, A. M. Kos, and M. Hallworth, Gravity currents descending a ramp in a stratified tank, Journal of Fluid Mechanics, vol.379, p.30, 1999.

A. Moncrieff, Classification of poorly-sorted sedimentary rocks, Sedimentary Geology, vol.65, pp.191-194, 1989.

G. E. Moore, Progress in digital integrated electronics, Electron Devices Meeting, vol.21, p.42, 1975.

M. Mortensen, Shenfun: High performance spectral Galerkin computing platform, J. Open Source Software, vol.3, p.57, 2018.

M. Mortensen and H. P. Langtangen, High performance Python for direct numerical simulations of turbulent flows, Computer Physics Communications, vol.203, p.57, 2016.

M. Mortensen, L. Dalcin, and D. E. Keyes, mpi4py-fft: Parallel Fast Fourier Transforms with MPI for Python, p.57, 2019.

V. Moureau, P. Domingo, and L. Vervisch, Design of a massively parallel CFD code for complex geometries, Comptes Rendus Mécanique, vol.339, p.134, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01672172

M. Mouyen, L. Longuevergne, P. Steer, A. Crave, J. Lemoine, H. Save, and C. Robin, Assessing modern river sediment discharge to the ocean using satellite gravimetry, vol.1, p.1, 2018.
URL : https://hal.archives-ouvertes.fr/insu-01406500

T. Mulder and J. Syvitski, Turbidity currents generated at river mouths during exceptional discharges to the world oceans, The Journal of Geology, vol.103, pp.285-299, 1995.

T. Mulder, J. P. M. Syvitski, S. Migeon, J.-C. Faugères, and B. Savoye, Marine hyperpycnal flows: initiation, behavior and related deposits. A review, Marine and Petroleum Geology, vol.20, issue.6-8, pp.861-882, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00407038

M. M. Nasr-Azadani and E. Meiburg, Turbidity currents interacting with three-dimensional seafloor topography, Journal of Fluid Mechanics, vol.745, p.31, 2014.

C. L. M. H. Navier, Sur les lois des mouvements des fluides, en ayant égard à l'adhésion des molécules, Ann. Chim. Paris, vol.19, p.7, 1821.

V. Neagoe, Chebyshev nonuniform sampling cascaded with the discrete cosine transform for optimum interpolation, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.38, issue.10, p.120, 1990.

F. Necker, C. Härtel, L. Kleiser, and E. Meiburg, Direct numerical simulation of particle-driven gravity currents, TSFP digital library online, p.30, 1999.

N. Nettelmann, J. J. Fortney, K. Moore, and C. Mankovich, An exploration of double diffusive convection in Jupiter as a result of hydrogen-helium phase separation, Monthly Notices of the Royal Astronomical Society, vol.447, p.33, 2015.

J. Nickolls and W. J. Dally, The GPU computing era, p.45, 2010.

R. Okuta, Y. Unno, D. Nishino, S. Hido, and C. Loomis, CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations, 31st Conference on Neural Information Processing Systems, vol.158, p.151, 2017.

A. E. Nocentino and P. J. Rhodes, Optimizing memory access on GPUs using Morton order indexing, Proceedings of the 48th Annual Southeast Regional Conference, p.90, 2010.

D. Novillo, OpenMP and automatic parallelization in GCC, Proceedings of the GCC Developers Summit, p.47, 2006.

C. Nugteren and V. Codreanu, CLTune: A generic auto-tuner for OpenCL kernels, 2015 IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip. IEEE, p.160, 2015.

A. Nukada, Y. Ogata, T. Endo, and S. Matsuoka, Bandwidth intensive 3-D FFT kernel for GPUs using CUDA, Proceedings of the 2008 ACM/IEEE conference on Supercomputing, p.91, 2008.

N. A. Okong'o and J. Bellan, Consistent large-eddy simulation of a temporal mixing layer laden with evaporating drops. Part 1. Direct numerical simulation, formulation and a priori analysis, Journal of Fluid Mechanics, vol.499, p.28, 2004.

T. E. Oliphant, Python for scientific computing, Computing in Science & Engineering, vol.9, p.133, 2007.

T. E. Oliphant, SciPy: Open source scientific tools for Python, Computing in Science & Engineering, vol.9, p.230, 2007.

P. J. O'Rourke, Collective drop effects on vaporizing liquid sprays, Tech. rep. Los Alamos National Lab, p.28, 1981.

S. A. Orszag, Numerical methods for the simulation of turbulence, The Physics of Fluids, vol.12, issue.12, p.98, 1969.

R. Ouillon, N. G. Lensky, V. Lyakhovsky, A. Arnon, and E. Meiburg, Halite precipitation from double-diffusive salt fingers in the Dead Sea: Numerical simulations, Water Resources Research, vol.55, p.37, 2019.

J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone et al., GPU computing, p.71, 2008.

J. D. Parsons and M. H. García, Enhanced sediment scavenging due to double-diffusive convection, Journal of Sedimentary Research, vol.70, issue.1, p.37, 2000.

D. Pekurovsky, P3DFFT: A framework for parallel computations of Fourier transforms in three dimensions, SIAM Journal on Scientific Computing, vol.34, pp.192-209, 2012.

P. Peterson, F2PY: a tool for connecting Fortran and Python programs, International Journal of Computational Science, p.231, 2009.

S. A. Piacsek and J. Toomre, Nonlinear evolution and structure of salt fingers, Elsevier Oceanography Series, vol.28, p.36, 1980.

M. Pippig, PFFT: An extension of FFTW to massively parallel architectures, SIAM Journal on Scientific Computing, vol.35, p.193, 2013.

A. Poisson and A. Papaud, Diffusion coefficients of major ions in seawater, Marine Chemistry, vol.13, issue.4, p.35, 1983.

M. Pons and P. Le Quéré, Modeling natural convection with the work of pressure-forces: a thermodynamic necessity, International Journal of Numerical Methods for Heat & Fluid Flow, vol.17, p.16, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01628609

W. Prager, Die Druckverteilung an Körpern in ebener Potentialströmung, Physik. Zeitschr, vol.29, p.19, 1928.

J. Price and S. McIntosh-Smith, Oclgrind: An extensible OpenCL device simulator, Proceedings of the 3rd International Workshop on OpenCL, vol.232, p.166, 2015.

C. Prud'homme, V. Chabannes, V. Doyeux, M. Ismail, A. Samake et al., Feel++: A computational framework for Galerkin methods and advanced numerical methods, ESAIM: Proceedings, vol.38, p.55, 2012.

C. M. Rader, Discrete Fourier transforms when the number of data samples is prime, Proceedings of the IEEE, vol.56, issue.6, p.99, 1968.

P. Ramachandran, PySPH: a reproducible and high-performance framework for smoothed particle hydrodynamics, Proceedings of the 15th Python in Science Conference, p.56, 2016.

M. J. Rashti, J. Green, P. Balaji, A. Afsahi, and W. Gropp, Multi-core and network aware MPI topology functions, European MPI Users' Group Meeting, p.47, 2011.

A. J. Raudkivi, Loose boundary hydraulics, p.38, 1998.

Lord Rayleigh, On convection currents in a horizontal layer of fluid, when the higher temperature is on the under side, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol.32, p.17, 1916.

G. W. Recktenwald, Finite-difference approximations to the heat equation, Mechanical Engineering, vol.10, p.105, 2004.

M. W. Reeks, Eulerian direct interaction applied to the statistical motion of particles in a turbulent fluid, Journal of Fluid Mechanics, vol.97, p.27, 1980.

O. Reynolds, XXIX. An experimental investigation of the circumstances which determine whether the motion of water shall be direct or sinuous, and of the law of resistance in parallel channels, Philosophical Transactions of the Royal Society of London, vol.174, p.16, 1883.

A. Ronacher, Jinja2 (the Python template engine), p.152, 2008.

L. Rosenhead, Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, vol.134, issue.823, p.19, 1931.

A. N. Ross, University of Cambridge, p.30, 2000.

D. Rossinelli and P. Koumoutsakos, Vortex methods for incompressible flow simulations on the GPU, The Visual Computer, vol.24, pp.699-708, 2008.

D. Rossinelli, M. Bergdorf, G. Cottet, and P. Koumoutsakos, GPU accelerated simulations of bluff body flows using vortex particle methods, Journal of Computational Physics, vol.229, pp.3316-3333, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00748016

D. Rossinelli, B. Hejazialhosseini, D. G. Spampinato, and P. Koumoutsakos, Multicore/multi-gpu accelerated simulations of multiphase compressible flows using wavelet adapted grids, SIAM Journal on Scientific Computing, vol.33, p.55, 2011.

D. Rossinelli, B. Hejazialhosseini, W. van Rees, M. Gazzola, M. Bergdorf, and P. Koumoutsakos, MRAG-I2D: Multi-resolution adapted grids for remeshed vortex methods on multicore architectures, Journal of Computational Physics, vol.288, p.57, 2015.

G. Ruetsch and P. Micikevicius, Optimizing matrix transpose in CUDA, Nvidia CUDA SDK Application Note 18, vol.160, p.91, 2009.

K. Rupp, CPU, GPU and MIC Hardware Characteristics over Time, 2013.

E. Rustico, G. Bilotta, G. Gallo, A. Hérault, C. Del Negro et al., A journey from single-GPU to optimized multi-GPU SPH with CUDA, 7th SPHERIC Workshop, p.56, 2012.

T. Sakurai, K. Yoshimatsu, N. Okamoto, and K. Schneider, Volume penalization for inhomogeneous Neumann boundary conditions modeling scalar flux in complicated geometry, Journal of Computational Physics, vol.390, p.70, 2019.

M. L. Ould Salihi, Couplage de méthodes numériques en simulation directe d'écoulements incompressibles, p.82, 1998.

M. F. Sanner, Python: a programming language for software integration and development, J Mol Graph Model, vol.17, issue.1, p.133, 1999.

I. F. Sbalzarini, J. H. Walther, M. Bergdorf, S. E. Hieber, E. M. Kotsalis, and P. Koumoutsakos, PPM: A highly efficient parallel particle-mesh library for the simulation of continuum systems, Journal of Computational Physics, vol.215, pp.566-588, 2006.

J. C. Schatzman, Accuracy of the discrete Fourier transform and the fast Fourier transform, SIAM Journal on Scientific Computing, vol.17, p.99, 1996.

R. Schmid, Descriptive nomenclature and classification of pyroclastic deposits and fragments, Geologische Rundschau, vol.70, pp.794-799, 1981.

D. P. Schmidt and C. J. Rutland, A new droplet collision algorithm, Journal of Computational Physics, vol.164, p.28, 2000.

R. W. Schmitt and D. L. Evans, An estimate of the vertical mixing due to salt fingers based on observations in the North Atlantic Central Water, Journal of Geophysical Research: Oceans, vol.83, issue.C6, pp.2913-2919, 1978.

R. W. Schmitt, The growth rate of super-critical salt fingers, Deep Sea Research Part A. Oceanographic Research Papers, vol.26, p.35, 1979.

I. J. Schoenberg, Contributions to the problem of approximation of equidistant data by analytic functions, I. J. Schoenberg Selected Papers, p.82, 1988.

B. Schulte, N. Konopliv, and E. Meiburg, Clear salt water above sediment-laden fresh water: Interfacial instabilities, Physical Review Fluids, vol.1, issue.1, p.37, 2016.

J. D. Schwarzkopf, M. Sommerfeld, C. T. Crowe, and Y. Tsuji, Multiphase flows with droplets and particles, vol.29, p.26, 2011.

F. Schwertfirm and M. Manhart, A numerical approach for simulation of turbulent mixing and chemical reaction at high Schmidt numbers, Micro and Macro Mixing, p.224, 2010.

P. N. Segrè, F. Liu, P. Umbanhowar, and D. Weitz, An effective gravitational temperature for sedimentation, Nature, vol.409, issue.6820, p.594, 2001.

J. Shen, T. Tang, and L. Wang, Spectral methods: algorithms, analysis and applications, vol.41, p.125, 2011.

A. Shields, Application of similarity principles and turbulence research to bedload movement, p.22, 1936.

B. I. Shraiman and E. D. Siggia, Scalar turbulence, Nature, vol.405, issue.6787, p.20, 2000.

B. Simon, Scheduling task graphs on modern computing platforms, p.146, 2018.
URL : https://hal.archives-ouvertes.fr/tel-01843558

O. Simonin, E. Deutsch, and J. P. Minier, Eulerian prediction of the fluid/particle correlated motion in turbulent two-phase flows, Applied Scientific Research, vol.51, issue.1-2, p.27, 1993.

J. E. Simpson, Gravity currents: In the environment and the laboratory, p.29, 1999.

H. D. Sinclair, Tectonostratigraphic model for underfilled peripheral foreland basins: An Alpine perspective, Geological Society of America Bulletin, vol.109, p.29, 1997.

O. P. Singh, D. Ranjan, J. Srinivasan, and K. R. Sreenivas, A study of basalt fingers using experiments and numerical simulations in double-diffusive systems, Journal of Geography and Geology, vol.3, issue.1, p.33, 2011.

O. P. Singh and J. Srinivasan, Effect of Rayleigh numbers on the evolution of double-diffusive salt fingers, Physics of Fluids, vol.26, p.33, 2014.

G. D. Smith, Numerical solution of partial differential equations: finite difference methods, p.85, 1985.

J. E. Smith, W. Hsu, and C. Hsiung, Future general purpose supercomputer architectures, Supercomputing'90: Proceedings of the 1990 ACM/IEEE Conference on Supercomputing. IEEE, p.41, 1990.

R. Smith, Performance of MPI Codes Written in Python with NumPy and mpi4py, 2016 6th Workshop on Python for High-Performance and Scientific Computing (PyHPC). IEEE, p.58, 2016.

B. P. Sommeijer, L. F. Shampine, and J. G. Verwer, RKC: An explicit solver for parabolic PDEs, Journal of Computational and Applied Mathematics, vol.88, p.223, 1998.

A. Sommerfeld, Ein Beitrag zur hydrodynamischen Erklaerung der turbulenten Fluessigkeitsbewegungen, p.16, 1908.

M. Sommerfeld, Validation of a stochastic Lagrangian modelling approach for inter-particle collisions in homogeneous isotropic turbulence, International Journal of Multiphase Flow, vol.27, p.28, 2001.

K. Spafford, J. Meredith, and J. Vetter, Maestro: data orchestration and tuning for OpenCL devices, European Conference on Parallel Processing, p.159, 2010.

V. Springel, The cosmological simulation code GADGET-2, Monthly Notices of the Royal Astronomical Society, vol.364, p.56, 2005.

P. Springer, A. Sankaran, and P. Bientinesi, TTC: A tensor transposition compiler for multiple architectures, Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, vol.162, p.91, 2016.

P. Springer, T. Su, and P. Bientinesi, HPTT: a high-performance tensor transposition C++ library, Proceedings of the 4th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, p.162, 2017.

K. D. Squires and J. K. Eaton, Measurements of particle dispersion obtained from direct numerical simulations of isotropic turbulence, Journal of Fluid Mechanics, vol.226, p.28, 1991.

K. R. Sreenivas, O. P. Singh, and J. Srinivasan, Effect of Rayleigh numbers on the evolution of double-diffusive salt fingers, Physics of Fluids, vol.21, p.32, 2009.

K. R. Sreenivas, O. P. Singh, and J. Srinivasan, On the relationship between finger width, velocity, and fluxes in thermohaline convection, Physics of Fluids, vol.21, issue.2, p.36, 2009.

R. M. Stallman and Z. Weinberg, The C preprocessor, Free Software Foundation, p.152, 1987.

A. Staniforth and J. Côté, Semi-Lagrangian integration schemes for atmospheric models: A review, Monthly Weather Review, vol.119, issue.9, p.74, 1991.

P. Steinbach and M. Werner, gearshifft: the FFT benchmark suite for heterogeneous platforms, International Supercomputing Conference, pp.199-216, 2017.

M. E. Stern, The "salt-fountain" and thermohaline convection, Tellus, vol.12, issue.2, p.33, 1960.

M. E. Stern, Collective instability of salt fingers, Journal of Fluid Mechanics, vol.35, p.33, 1969.

M. E. Stern, Ocean circulation physics, vol.19, p.35, 1975.

G. G. Stokes, On the effect of the internal friction of fluids on the motion of pendulums, vol.9, 1851.

G. G. Stokes, On the theories of the internal friction of fluids in motion, and of the equilibrium and motion of elastic solids, Transactions of the Cambridge Philosophical Society, vol.8, 1880.

H. Stommel, An oceanographic curiosity: the perpetual salt fountain, Deep-Sea Res., vol.3, p.33, 1956.

H. S. Stone, An efficient parallel algorithm for the solution of a tridiagonal linear system of equations, Journal of the ACM (JACM), vol.20, issue.1, p.105, 1973.

J. E. Stone, D. Gohara, and G. Shi, OpenCL: A parallel programming standard for heterogeneous computing systems, Computing in Science & Engineering, vol.12, p.48, 2010.

G. Strang, On the construction and comparison of difference schemes, SIAM journal on numerical analysis, vol.5, p.67, 1968.

B. Su and K. Keutzer, clSpMV: A cross-platform OpenCL SpMV framework on GPUs, Proceedings of the 26th ACM international conference on Supercomputing. ACM, p.159, 2012.

C. Su, P. Chen, C. Lan, L. Huang, and K. Wu, Overview and comparison of OpenCL and CUDA technology for GPGPU, 2012 IEEE Asia Pacific Conference on Circuits and Systems, p.149, 2012.

H. Su, N. Wu, M. Wen, C. Zhang, and X. Cai, On the GPU performance of 3D stencil computations implemented in OpenCL, International Supercomputing Conference, p.90, 2013.

S. Subramaniam, Lagrangian-Eulerian methods for multiphase flows, Progress in Energy and Combustion Science, vol.39, p.27, 2013.

Y. Sugimoto, F. Ino, and K. Hagihara, Improving cache locality for GPU-based volume rendering, Parallel Computing, vol.40, issue.5-6, p.90, 2014.

S. Sundaram and L. R. Collins, Collision statistics in an isotropic particle-laden turbulent suspension. Part 1. Direct numerical simulations, Journal of Fluid Mechanics, vol.335, p.28, 1997.

D. C. Swailes and K. F. F. Darbyshire, A generalized Fokker-Planck equation for particle transport in random media, Physica A: Statistical Mechanics and its Applications, vol.242, p.27, 1997.

P. N. Swarztrauber, Vectorizing the FFTs, Parallel computations, p.229, 1982.

J. Sweet, D. H. Richter, and D. Thain, GPU acceleration of Eulerian-Lagrangian particle-laden turbulent flow simulations, International Journal of Multiphase Flow, vol.99, p.56, 2018.

T. Sych, OpenCL Device Fission for CPU Performance, p.163, 2013.

Z. Sylvester, Exploring grain settling with Python, p.38, 2013.

J. Taylor and P. Bucens, Laboratory experiments on the structure of salt fingers, Deep Sea Research Part A. Oceanographic Research Papers, vol.36, p.36, 1989.

C. Teisson, Cohesive suspended sediment transport: feasibility and limitations of numerical modeling, Journal of Hydraulic Research, vol.29, p.22, 1991.

A. R. Terrel, From equations to code: Automated scientific computing, Computing in Science & Engineering, vol.13, issue.2, p.152, 2011.

T. Tezduyar, S. Aliabadi, M. Behr, A. Johnson, V. Kalro et al., Flow simulation and high performance computing, Computational Mechanics, vol.18, issue.6, p.55, 1996.

J. Tompson and K. Schlachter, An introduction to the OpenCL programming model, vol.49, p.149, 2012.

A. Traxler, S. Stellmach, P. Garaud, T. Radko, and N. Brummell, Dynamics of fingering convection. Part 1. Small-scale fluxes and large-scale instabilities, Journal of Fluid Mechanics, vol.677, p.36, 2011.

W. F. Trench, An algorithm for the inversion of finite Toeplitz matrices, Journal of the Society for Industrial and Applied Mathematics, vol.12, p.105, 1964.

N. Trevett, OpenCL overview, Talk at SIGGRAPH Asia. Accessed on 26, p.48, 2012.

J. S. Turner, Multicomponent convection, Annual Review of Fluid Mechanics, vol.17, p.32, 1985.

M. Uhlmann, The need for de-aliasing in a Chebyshev pseudo-spectral method, Potsdam Institute for Climate Impact Research, p.124, 2000.

M. Uhlmann, An immersed boundary method with direct forcing for the simulation of particulate flows, Journal of Computational Physics, vol.209, p.28, 2005.

M. Ungarish and H. E. Huppert, On gravity currents propagating at the base of a stratified ambient, Journal of Fluid Mechanics, vol.458, p.31, 2002.

D. Valdez-Balderas, J. M. Domínguez, B. D. Rogers, and A. J. C. Crespo, Towards accelerating smoothed particle hydrodynamics simulations for free-surface flows on multi-GPU clusters, Journal of Parallel and Distributed Computing, vol.73, p.56, 2013.

S. van der Walt, S. C. Colbert, and G. Varoquaux, The NumPy array: a structure for efficient numerical computation, Computing in Science & Engineering, vol.13, issue.2, p.157, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00564007

W. M. van Rees, A. Leonard, D. I. Pullin, and P. Koumoutsakos, A comparison of vortex and pseudo-spectral methods for the simulation of periodic vortical flows at high Reynolds numbers, Journal of Computational Physics, vol.230, pp.181-183, 2011.

B. van Werkhoven, J. Maassen, F. J. Seinstra, and H. E. Bal, Performance models for CPU-GPU data transfers, 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, pp.11-20, 2014.

J. Vedurada, A. Suresh, A. Sukumaran-Rajam, J. Kim, C. Hong et al., TTLG: an efficient tensor transposition library for GPUs, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, pp.578-588, 2018.

T. Veldhuizen, Expression templates, C++ Report, vol.7, issue.5, p.151, 1995.

A. Verma, A. E. Helal, K. Krommydas, and W. Feng, Accelerating workloads on FPGAs via OpenCL: A case study with OpenDwarfs, p.90, 2016.

M. K. Verma, A. Chatterjee, K. S. Reddy, R. K. Yadav, S. Paul et al., Benchmarking and scaling studies of pseudospectral code Tarang for turbulence simulations, Pramana, vol.81, issue.4, p.57, 2013.

B. Vowinckel, J. Withers, P. Luzzatto-Fegiz, and E. Meiburg, Settling of cohesive sediment: particle-resolved simulations, Journal of Fluid Mechanics, vol.858, p.28, 2019.

R. Vuduc, J. W. Demmel, and K. A. Yelick, OSKI: A library of automatically tuned sparse matrix kernels, Journal of Physics: Conference Series, vol.16, p.159, 2005.

M. Wagner, G. Llort, E. Mercadal, J. Giménez, and J. Labarta, Performance Analysis of Parallel Python Applications, Procedia Computer Science, vol.108, p.58, 2017.

M. M. Waldrop, The chips are down for Moore's law, Nature News, vol.530, issue.7589, p.42, 2016.

D. W. Walker and J. J. Dongarra, MPI: a standard message passing interface, Supercomputer, vol.12, p.47, 1996.

H. Wang, S. Potluri, D. Bureddy, C. Rosales, and D. Panda, GPU-aware MPI on RDMA-enabled clusters: Design, implementation and evaluation, IEEE Transactions on Parallel and Distributed Systems, vol.25, p.141, 2013.

S. Wang and Y. Liang, A comprehensive framework for synthesizing stencil algorithms on FPGAs using OpenCL model, 54th ACM/EDAC/IEEE Design Automation Conference (DAC). IEEE, p.90, 2017.

Z. Wang, On computing the discrete Fourier and cosine transforms, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.33, issue.5, p.176, 1985.

T. Watanabe, T. Naito, Y. Sakai, K. Nagata, and Y. Ito, Mixing and chemical reaction at high Schmidt number near turbulent/non-turbulent interface in planar liquid jet, Physics of Fluids, vol.27, p.224, 2015.

R. Weber, A. Gothandaraman, R. J. Hinde, and G. D. Peterson, Comparing hardware accelerators in scientific applications: A case study, IEEE Transactions on Parallel and Distributed Systems, vol.22, issue.1, p.44, 2010.

B. van Werkhoven, Kernel Tuner: A search-optimizing GPU code auto-tuner, Future Generation Computer Systems, vol.90, p.160, 2019.

R. C. Whaley and J. J. Dongarra, Automatically tuned linear algebra software, SC'98: Proceedings of the 1998 ACM/IEEE conference on Supercomputing. IEEE, p.159, 1998.

T. Willhalm and N. Popovici, Putting Intel® Threading Building Blocks to work, Proceedings of the 1st international workshop on Multicore software engineering, p.56, 2008.

S. Williams, A. Waterman, and D. Patterson, Roofline: An insightful visual performance model for floating-point programs and multicore architectures, 2009.

J. Wu and J. JaJa, High performance FFT based Poisson solver on a CPU-GPU heterogeneous platform, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, p.58, 2013.

Y. Aridor, E. Fiksman, and D. Singer, OpenCL device partition by names extension, p.163, 2013.

X. Yu, T. Hsu, and S. Balachandar, Convective instability in sedimentation: Linear stability analysis, Journal of Geophysical Research: Oceans, vol.118, p.37, 2013.

X. Yu, T. Hsu, and S. Balachandar, Convective instability in sedimentation: 3-D numerical study, Journal of Geophysical Research: Oceans, vol.119, pp.8141-8161, 2014.

S.-R. Zhang, X. X. Lu, D. L. Higgitt, C.-T. A. Chen, H.-G. Sun et al., Water chemistry of the Zhujiang (Pearl River): natural processes and anthropogenic influences, Journal of Geophysical Research: Earth Surface, vol.112, p.23, 2007.

Y. Zhang, M. Sinclair, and A. Chien, Improving performance portability in OpenCL programs, International Supercomputing Conference, p.159, 2013.

Y. Zhang and F. Mueller, Auto-generation and auto-tuning of 3D stencil codes on GPU clusters, Proceedings of the Tenth International Symposium on Code Generation and Optimization, p.90, 2012.

Z. Zhang and A. Prosperetti, A second-order method for three-dimensional particle simulation, Journal of Computational Physics, vol.210, issue.1, p.28, 2005.

H. R. Zohouri, A. Podobas, and S. Matsuoka, Combined spatial and temporal blocking for high-performance stencil computation on FPGAs using OpenCL, Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, p.90, 2018.