Instrumented semantics

Sufficient Condition for Correct Registration

Related work

Concluding remarks

Conclusion and Future Work

7.1 Context

Large calculations that require parallel computation are now commonplace, notably (though not exclusively) in the natural sciences, where increasing fidelity in simulations enables a more precise understanding of subjects as diverse as the origin of the universe and the composition of matter.

However, the well-known difficulties of developing correct sequential programs, and the disastrous effects of approaching them lightly (of which spectacular examples abound [65]), are exacerbated by the exponential number of interactions between processes. Additionally, the hoped-for increase in computational power from parallel computing does not come for free. In all but embarrassingly parallel cases, a performance increase requires a well-thought-out strategy for parallelizing the problem at hand, which demands significant effort. It is also difficult to ensure a priori that the parallelization will scale and be portable. The Bulk Synchronous Parallel model, and its commonly used implementation in the BSPlib library, answers some of these concerns by providing a structure for parallel computing that rules out certain classes of errors. Furthermore, it

all-ok: Take any i ∈ Pid

. Again and . Pid, By the premises of this rule, S[i], ?[i] ? i Wait(S ? [i]), ? ?? [i] and S ? , ? ?? ?? ? ? By the induction hypothesis

D. Pid-=-{?-?-d-pid-|-?-v-?}-where, RS(s, pi), Theorem 3. If (pi ? , pi ? ) PI(s)

, s))), then s is textually aligned for any environment in D V Pid . Proof: We prove the stronger property P(s) by structural induction on s

?. P(s)-??-??-?-l,-?pi-=-(pi, pi PI ? (s) ? RS(s, pi) ? ?? ? D ? 1 (pi ? (init(s))) pi such that pi PI ? (s), RS(s, pi), let V = ? 1 (pi ? (init(s))), ? ? D V Pid

. ?-s-?-x-:=-e, Then s = [X ? e], and since ? ? D Pid we have by the definition of

. =-?-s, We apply the induction hypothesis twice: (i) first to show that ( s 1 ?) = ? S , and (ii) to show that s 2 ( s 1 ?)

, with ? ? = pi ? (init(s 1 )). From the premises of RS(s, pi) we have RS(s 1 , pi), pi PI ? ? (s 1 ), vol.14

, If ? ? = ? then s ? = s 2 ? ? = ? = ? S and we are done, so assume ? ? = ?

?. V-?-?-?-where-v-?-=-?-1,

?. Pid, Let V ?? = ? 1 (pi ? (init(s 2 ))). By the induction hypothesis we get that s 2 is textually aligned for D V ?? Pid . From the above inclusion, we have V ?? ? V ? , i.e. ? V ?? ? ? , and since the semantic function of statements is stable by Lemma 4, i.e. ? ? ? D Pid, pi PI ? ?? (s 2 ), with ? ?? = pi ? (init, vol.14

S. Aananthakrishnan, G. Bronevetsky, M. Baranowski, and G. Gopalakrishnan, ParFuse: Parallel and Compositional Analysis of Message Passing Programs, Languages and Compilers for Parallel Computing, pp.24-39

S. Agarwal, R. Barik, V. Sarkar, and R. K. Shyamasundar, May-happen-in-parallel Analysis of X10 Programs, Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '07, vol.75, p.76, 2007.

D. H. Ahn, B. R. de Supinski, I. Laguna, G. L. Lee, B. Liblit et al., Scalable Temporal Order Analysis for Large Scale Debugging, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, vol.44, pp.1-44, 2009.

A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, p.70, 1986.

A. Aiken and D. Gay, Barrier Inference, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '98, vol.73, p.154, 1998.

E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini, Cost Analysis of Java Bytecode, European Symposium on Programming, pp.157-172, Springer, vol.125, p.147, 2007.

E. Albert, P. Arenas, S. Genaim, and G. Puebla, Closed-Form Upper Bounds in Static Cost Analysis, Journal of Automated Reasoning, vol.46, issue.2, p.147, 2011.

. Bibliography,

E. Albert, J. Correas, and G. Román-Díez, Peak cost analysis of distributed systems, International Static Analysis Symposium, p.78, 2014.

E. Albert, P. Arenas, J. Correas, S. Genaim, M. Gómez-Zamalloa et al., Resource Analysis: From Sequential to Concurrent and Distributed Programs, FM 2015: Formal Methods: 20th International Symposium, vol.78, p.152, 2015.

V. Allombert, Functional Abstraction for Programming Multi-Level Architectures : Formalisation and Implementation, vol.56, p.79, 2017.
URL : https://hal.archives-ouvertes.fr/tel-01693568

V. Allombert, F. Gava, and J. Tesson, Multi-ML: Programming Multi-BSP Algorithms in ML, International Journal of Parallel Programming, vol.59, p.61, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01160164

V. Allombert, F. Gava, and J. Tesson, Toward Performance Prediction for Multi-BSP Programs in ML, International Conference on Algorithms and Architectures for Parallel Processing, p.79, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01941250

R. Alur, J. Devietti, O. S. Navarro-Leija, and N. Singhania, GPUDrano: Detecting Uncoalesced Accesses in GPU Programs, Computer Aided Verification, vol.73, p.75, 2017.

J. Anderson, P. J. Burns, D. Milroy, P. Ruprecht, T. Hauser et al., Deploying RMACC Summit: An HPC Resource for the Rocky Mountain Region, Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, vol.17, pp.1-8, 2017.

E. A. Ashcroft, Proving assertions about parallel programs, Journal of Computer and System Sciences, vol.10, issue.1, pp.110-135, 1975.

M. Assaf, From Qualitative to Quantitative Program Analysis: Permissive Enforcement of Secure Information Flow, p.70, 2015.
URL : https://hal.archives-ouvertes.fr/tel-01184857

J. Auslander, M. Philipose, C. Chambers, S. J. Eggers, and B. N. Bershad, Fast, effective dynamic compilation, ACM SIGPLAN Notices, vol.31, p.106, 1996.

O. Ballereau, F. Loulergue, and G. Hains, High-level BSP Programming: BSML and BSλ, Proceedings of the First Scottish Functional Programming Workshop, vol.42, p.61, 1999.

E. Bardsley and A. F. Donaldson, Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels, NASA Formal Methods, pp.230-245, 2014.

M. Benabderrahmane, L. Pouchet, A. Cohen, and C. Bastoul, The Polyhedral Model Is More Widely Applicable than You Think, International Conference on Compiler Construction, vol.121, p.142, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00551087

Y. Bertot and P. Castéran, Interactive Theorem Proving and Program Development: Coq'Art: The Calculus of Inductive Constructions, vol.65, p.83, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00344237

A. Betts, N. Chong, A. Donaldson, S. Qadeer, and P. Thomson, GPUVerify: A verifier for GPU kernels, ACM SIGPLAN Notices, vol.47, p.73, 2012.

R. H. Bisseling, Parallel Scientific Computation: A Structured Approach Using BSP and MPI, vol.14, p.114, 2004.

A. Blanchard, N. Kosmatov, M. Lemerre, and F. Loulergue, Conc2Seq: A Frama-C Plugin for Verification of Parallel Compositions of C Programs, 2016 IEEE 16th International Working Conference on Source Code Analysis and Manipulation (SCAM), p.153, 2016.

S. Blazy, D. Bühler, and B. Yakobowski, Structuring Abstract Interpreters Through State and Value Abstractions, Verification, Model Checking, and Abstract Interpretation, pp.112-130, 2017.
URL : https://hal.archives-ouvertes.fr/cea-01808886

O. Bonorden, B. Juurlink, I. von Otte, and I. Rieping, The Paderborn University BSP (PUB) library: design, implementation and performance, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing, IPPS/SPDP 1999, pp.99-104, 1999.

O. Bonorden, B. Juurlink, I. von Otte, and I. Rieping, The Paderborn University BSP (PUB) library, Parallel Computing, vol.29, issue.2, p.184, 2003.

V. Botbol, E. Chailloux, and T. Le Gall, Static Analysis of Communicating Processes Using Symbolic Transducers, International Conference on Verification, Model Checking, and Abstract Interpretation, p.71, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01449374

L. Bougé, The Data-Parallel Programming Model: A Semantic Perspective (Final Version), INRIA, p.67, 1996.

P. Boulet and X. Redon, Communication Pre-evaluation in HPF, Proceedings of the 4th International Euro-Par Conference on Parallel Processing, vol.153, p.154, 1998.
URL : https://hal.archives-ouvertes.fr/inria-00565190

W. Bousdira, F. Loulergue, and J. Tesson, A verified library of algorithmic skeletons on evenly distributed arrays, International Conference on Algorithms and Architectures for Parallel Processing, p.67, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00708822

G. Bronevetsky, Communication-Sensitive Static Dataflow for Parallel Message Passing Applications, Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '09, vol.72, p.74, 2009.

J. Buurlage, T. Bannink, and A. Wits, Bulk-synchronous pseudostreaming algorithms for many-core accelerators, p.39, 2016.

J. Buurlage, T. Bannink, and R. H. Bisseling, Bulk: A Modern C++ Interface for Bulk-Synchronous Parallel Programs, Euro-Par 2018: Parallel Processing, pp.519-532, 2018.

V. Cavé, J. Zhao, J. Shirako, and V. Sarkar, Habanero-Java: The new adventures of old X10, Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, p.74, 2011.

B. Chapman, T. Curtis, S. Pophale, S. Poole, J. Kuehn et al., Introducing OpenSHMEM: SHMEM for the PGAS community, Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, vol.41, p.184, 2010.

P. Chatarasi, J. Shirako, M. Kong, and V. Sarkar, An Extended Polyhedral Model for SPMD Programs and Its Use in Static Data Race Detection, Languages and Compilers for Parallel Computing, vol.76, p.153, 2017.

K. C. Chaudhuri, D. Doligez, L. Lamport, and S. Merz, A TLA+ proof system, p.66, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00338299

Y. Chen and J. W. Sanders, Top-down design of bulk-synchronous parallel programs, Parallel Processing Letters, vol.13, issue.03, p.67, 2003.

Y. Chen and J. W. Sanders, Logic of global synchrony, ACM Transactions on Programming Languages and Systems (TOPLAS), vol.26, issue.2, p.67, 2004.

T. Christiansen, L. Wall, and J. Orwant, Programming Perl: Unmatched Power for Text Processing and Scripting, p.6, 2012.

P. Ciechanowicz, M. Poldner, and H. Kuchen, The Münster Skeleton Library Muesli: A comprehensive overview, p.63, 2009.

E. M. Clarke, O. Grumberg, and D. Peled, Model Checking, p.68, 1999.

P. Clauss, Counting Solutions to Linear and Nonlinear Constraints Through Ehrhart Polynomials: Applications to Analyze and Transform Scientific Programs, Proceedings of the 10th International Conference on Supercomputing, ICS '96, vol.78, p.153, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01100306

E. Cohen, M. Dahlweid, M. Hillebrand, D. Leinenbach, M. Moskal et al., VCC: A practical system for verifying concurrent C, Theorem Proving in Higher Order Logics, vol.65, p.66, 2009.

M. I. Cole, Algorithmic Skeletons: Structured Management of Parallel Computation, p.62, 1989.

B. Cook, Principles of program termination, Engineering Methods and Tools for Software Safety and Security, vol.22, p.125, 2009.

P. Cousot and R. Cousot, Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints, Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '77, vol.43, p.53, 1977.

P. Cousot and N. Halbwachs, Automatic discovery of linear restraints among variables of a program, Proceedings of the 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp.84-96

R. Couturier and D. Méry, An experiment in parallelizing an application using formal methods, Computer Aided Verification, pp.345-356, Springer, 1998.
URL : https://hal.archives-ouvertes.fr/inria-00098540

D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser et al., LogP: Towards a realistic model of parallel computation, ACM Sigplan Notices, vol.28, p.57, 1993.

P. Cuoq, P. Hilsenkopf, F. Kirchner, S. Labbé, N. Thuy et al., Formal verification of software important to safety using the Frama-C tool suite, Proceedings of the 8th International Topical Meeting on Nuclear Plant Instrumentation, Control and Human Machine Interface Technologies, p.53, 2012.

P. Cuoq, F. Kirchner, N. Kosmatov, V. Prevosto, J. Signoles et al., Frama-C, Software Engineering and Formal Methods, vol.65, p.66, 2012.
URL : https://hal.archives-ouvertes.fr/hal-02263407

P. Cuoq, F. Kirchner, N. Kosmatov, V. Prevosto, J. Signoles et al., Frama-C: A Program Analysis Perspective, The 10th International Conference on Software Engineering and Formal Methods (SEFM 2012), vol.7504, p.52, 2012.

F. Dabrowski, A Denotational Semantics of Textually Aligned SPMD Programs, International Symposium on Formal Approaches to Parallel and Distributed Systems (4PAD 2018), vol.83, p.116, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01785110

F. Dabrowski, Textual Alignment in SPMD Programs, Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC '18, vol.79, p.170, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01559832

N. A. Danielsson, Lightweight Semiformal Time Complexity Analysis for Purely Functional Data Structures, Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '08, pp.133-144, 2008.

S. Darabi, S. C. Blom, and M. Huisman, A Verification Technique for Deterministic Parallel Programs, NASA Formal Methods, pp.247-264, Springer, 2017.

F. Darema, SPMD computational model, Encyclopedia of Parallel Computing, p.24, 2011.

L. de Moura and N. Bjørner, Z3: An efficient SMT solver, International Conference on Tools and Algorithms for the Construction and Analysis of Systems, p.65, 2008.

M. Delahaye, N. Kosmatov, and J. Signoles, Common Specification Language for Static and Dynamic Analysis of C Programs, The 28th Annual ACM Symposium on Applied Computing (SAC 2013), pp.1230-1235, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00853721

B. Di Martino, A. Mazzeo, N. Mazzocca, and U. Villano, Parallel Program Analysis and Restructuring by Detection of Point-to-Point Interaction Patterns and Their Transformation into Collective Communication Constructs, Science of Computer Programming, vol.40, issue.2-3, p.141, 2001.

E. W. Dijkstra, Guarded Commands, Nondeterminacy and Formal Derivation of Programs, Communications of the ACM, vol.18, issue.8, p.53, 1975.

Y. Dubois and R. Teyssier, On the onset of galactic winds in quiescent star forming galaxies, Astronomy and Astrophysics, vol.477, p.185, 2008.

J. Eloff and M. B. Bella, Software Failures: An Overview, Software Failure Investigation: A Near-Miss Analysis Approach, p.185, 2018.

K. Emoto, F. Loulergue, and J. Tesson, A Verified Generate-Test-Aggregate Coq Library for Parallel Programs Extraction, Interactive Theorem Proving, number 8558 in Lecture Notes in Computer Science, pp.258-274, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00964061

A. Ernstsson, L. Li, and C. Kessler, SkePU 2: Flexible and type-safe skeleton programming for heterogeneous parallel systems, International Journal of Parallel Programming, vol.46, issue.1, p.63, 2018.

J. Filliâtre and A. Paskevich, Why3: Where programs meet provers, Programming Languages and Systems, vol.65, p.67, 2013.

C. Flanagan and S. Qadeer, Thread-modular model checking, International SPIN Workshop on Model Checking of Software, pp.213-224, Springer, p.71, 2003.

L. Flon and N. Suzuki, Consistent and complete proof rules for the total correctness of parallel programs, Foundations of Computer Science, p.65, 1978.

M. J. Flynn, Some computer organizations and their effectiveness, IEEE transactions on computers, vol.100, issue.9, p.25, 1972.

V. Forejt, S. Joshi, D. Kroening, G. Narayanaswamy, and S. Sharma, Precise predictive analysis for discovering communication deadlocks in MPI programs, ACM Transactions on Programming Languages and Systems (TOPLAS), vol.39, issue.4, p.69, 2017.

J. Fortin and F. Gava, Towards Mechanised Semantics of HPC: The BSP with Subgroup Synchronisation Case, Algorithms and Architectures for Parallel Processing, pp.222-237, 2015.

J. Fortin and F. Gava, BSP-Why: A Tool for Deductive Verification of BSP Algorithms with Subgroup Synchronisation, International Journal of Parallel Programming, vol.44, issue.3, p.183, 2016.

S. Fortune and J. Wyllie, Parallelism in random access machines, Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, p.56, 1978.

M. I. Frank, A. Agarwal, and M. K. Vernon, LoPC: Modeling contention in parallel algorithms, Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP '97, pp.276-287

M. Frieb, A. Stegmeier, J. Mische, and T. Ungerer, Employing MPI Collectives for Timing Analysis on Embedded Multi-Cores, 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016), vol.55, pp.1-10, 2016.

Z. Ganjei, A. Rezine, L. Henrio, P. Eles, and Z. Peng, On Reachability in Parameterized Phaser Programs, p.74, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02061520

F. Gava, Formal proofs of functional BSP programs, Parallel Processing Letters, vol.13, pp.365-376, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00005601

F. Gava and J. Fortin, Formal Semantics of a Subset of the Paderborn's BSPlib, Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies, vol.68, p.184, 2008.

F. Gava and F. Loulergue, A static analysis for Bulk Synchronous Parallel ML to avoid parallel nesting, Future Generation Computer Systems, vol.21, issue.5, p.78, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00110830

A. V. Gerbessiotis and S. Lee, Remote memory access: A case for portable, efficient and library independent parallel programming, Scientific Programming, vol.12, issue.3, p.43, 2004.

R. Gerstenberger, M. Besta, and T. Hoefler, Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided, p.24, 2014.

L. Gesbert, Z. Hu, F. Loulergue, K. Matsuzaki, and J. Tesson, Systematic development of correct bulk synchronous parallel programs, Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2010 International Conference On, p.67, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00512867

H. González-Vélez and M. Leyton, A survey of algorithmic skeleton frameworks: High-level structured parallel programming enablers, Software: Practice and Experience, vol.40, p.62, 2010.

G. Gopalakrishnan, P. D. Hovland, C. Iancu, S. Krishnamoorthy, I. Laguna et al., Report of the HPC Correctness Summit, p.79, 2017.

S. Gorlatch and M. Cole, Parallel Skeletons, Encyclopedia of Parallel Computing, pp.1417-1422, 2011.

M. W. Goudreau, K. Lang, S. B. Rao, and T. Tsantilas, The green BSP library, Report CS TR, vol.95, issue.11, p.63, 1995.

W. Gropp, T. Hoefler, R. Thakur, and E. Lusk, Using Advanced MPI: Modern Features of the Message-Passing Interface, vol.42, p.188, 2014.

T. Grosser, A. Groesslinger, and C. Lengauer, Polly: Performing polyhedral optimizations on a low-level intermediate representation, Parallel Processing Letters, vol.22, issue.04, p.154, 2012.

I. Grudenic and N. Bogunovic, Modeling and verification of MPI based distributed software, European Parallel Virtual Machine/Message Passing Interface Users' Group Meeting, p.69, 2006.

A. Gustavsson, J. Gustafsson, and B. Lisper, Timing Analysis of Parallel Software Using Abstract Execution, Verification, Model Checking, and Abstract Interpretation, pp.59-77, 2014.

G. Hains, Subset synchronization in BSP computing, PDPTA, vol.98, p.15, 1998.

G. Hains, Algorithmes et programmation parallèles : Théorie avec BSP et pratique avec OCaml. Ellipses Marketing, 2018.

G. Hains and A. Domínguez, Real-time parallel routing for telecom networks: Graph algorithms and bulk-synchronous parallel acceleration, IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), p.114, 2016.

G. Hains and F. Loulergue, Functional Bulk Synchronous Parallel Programming using the BSMLlib Library, Second International Workshop on Constructive Methods for Parallel Programming, 2000.

V. Halyo, P. Legresley, and P. Lujan, Massively parallel computing and the search for jets and black holes at the LHC, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol.744, pp.54-60, 2014.

Y. Hayashi and M. Cole, Static Performance Prediction of Skeletal Parallel Programs, Parallel Algorithms and Applications, vol.17, issue.1, p.153, 2002.

F. Heine and A. Slowik, Volume Driven Data Distribution for NUMA-Machines, Euro-Par 2000 Parallel Processing, pp.415-424, 2000.

J. M. Hill, B. McColl, D. C. Stefanescu, M. W. Goudreau, K. Lang et al., BSPlib: The BSP Programming Library, Parallel Computing, vol.24, issue.14, p.63, 1998.

C. A. R. Hoare, An axiomatic basis for computer programming, Communications of the ACM, vol.12, issue.10, p.65, 1969.

C. A. R. Hoare, Communicating sequential processes, Communications of the ACM, vol.21, issue.8, p.60, 1978.

T. Hoefler, J. Dinan, R. Thakur, B. W. Barrett, P. Balaji et al., Remote Memory Access Programming in MPI-3, ACM Transactions on Parallel Computing (TOPC), vol.2, p.77, 2015.

J. Hoffmann and M. Hofmann, Amortized resource analysis with polynomial potential, European Symposium on Programming, pp.287-306, Springer, p.78, 2010.

J. Hoffmann and Z. Shao, Automatic Static Cost Analysis for Parallel Programs, European Symposium on Programming Languages and Systems, vol.78, p.152, 2015.

M. Hofmann and S. Jost, Static prediction of heap space usage for first-order functional programs, ACM SIGPLAN Notices, vol.38, pp.185-197, 2003.

W. Hu, N. Huang, and T. Chiueh, Software Defined Radio Implementation of an LTE Downlink Transceiver for Ultra Dense Networks, 2018 IEEE International Symposium on Circuits and Systems (ISCAS), p.1, 2018.

Y. Huang and E. Mercer, Detecting MPI Zero Buffer Incompatibility by SMT Encoding, NASA Formal Methods, pp.219-233, 2015.

J. Hückelheim, Z. Luo, S. H. Narayanan, S. Siegel, and P. D. Hovland, Verifying Properties of Differentiable Programs, Static Analysis, pp.205-222, 2018.

A. Jakobsson, Automatic Cost Analysis for Imperative BSP Programs, International Journal of Parallel Programming, vol.47, issue.2, pp.184-212, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01818140

A. Jakobsson, F. Dabrowski, W. Bousdira, F. Loulergue, and G. Hains, Replicated Synchronization for Imperative BSP Programs, Procedia Computer Science, vol.108, pp.535-544, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01494832

A. Jakobsson, F. Dabrowski, and W. Bousdira, Safe Usage of Registers in BSPlib, Proceedings of the 34th Annual ACM Symposium on Applied Computing, SAC '19, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01955283

F. A. Jakobsson, Optimized Support of Memory-Related Annotations for Runtime Assertion Checking with Frama-C. Master's thesis, 1952.

N. Javed and F. Loulergue, OSL: Optimized bulk synchronous parallel skeletons on distributed arrays, Advanced Parallel Processing Technologies, p.64, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00452523

B. Jeannet and A. Miné, Apron: A Library of Numerical Abstract Domains for Static Analysis, International Conference on Computer Aided Verification, p.147, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00786354

T. E. Jeremiassen and S. J. Eggers, Static analysis of barrier synchronization in explicitly parallel programs, IFIP PACT, vol.72, p.74, 1994.

H. Jifeng, Q. Miller, and L. Chen, Algebraic laws for BSP programming, Euro-Par'96 Parallel Processing, number 1124 in Lecture Notes in Computer Science, pp.359-368, 1996.

A. Jones, R. Melhem, and S. Shao, A compiler-based communication analysis approach for multiprocessor systems, Parallel and Distributed Processing Symposium, International(IPDPS), p.65, 2006.

G. Jones and M. Goldsmith, Programming in Occam, p.60, 1987.

B. H. Juurlink and H. A. Wijshoff, A Quantitative Comparison of Parallel Computation Models, Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA '96, pp.13-24, 1996.

A. Kamil and K. Yelick, Concurrency Analysis for Parallel Programs with Textually Aligned Barriers, Proceedings of the 18th International Conference on Languages and Compilers for Parallel Computing, LCPC'05, vol.6, p.117, 2006.

A. Kamil and K. Yelick, Enforcing Textual Alignment of Collectives Using Dynamic Checks, Languages and Compilers for Parallel Computing, pp.368-382, 2009.

R. M. Keller, Formal Verification of Parallel Programs, Commun. ACM, vol.19, issue.7, pp.371-384, 1976.

I. Laguna and M. Schulz, Pinpointing scale-dependent integer overflow bugs in large-scale parallel applications, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, p.19, 2016.

L. Lamport, The 'Hoare logic' of concurrent programs, Acta Informatica, vol.14, issue.1, p.65, 1980.

J. K. Lee, J. Palsberg, R. Majumdar, and H. Hong, Efficient may happen in parallel analysis for async-finish parallelism, International Static Analysis Symposium, p.74, 2012.

C. Lengauer, Loop Parallelization in the Polytope Model, International Conference on Concurrency Theory, p.153, 1993.

C. Li and G. Hains, A simple bridging model for high-performance computing, High Performance Computing and Simulation (HPCS), 2011 International Conference On, vol.56, p.58, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00926383

G. Li, R. Palmer, M. Delisi, G. Gopalakrishnan, and R. M. Kirby, Formal Specification of MPI 2.0: Case Study in Specifying a Practical Concurrent Programming API, Sci. Comput. Program, vol.76, issue.2, p.69, 2011.

Y. Lin, Static Nonconcurrency Analysis of OpenMP Programs, OpenMP Shared Memory Parallel Programming, pp.36-50, 2008.

B. Lisper, Towards Parallel Programming Models for Predictability, 12th International Workshop on Worst-Case Execution Time Analysis, vol.23, pp.48-58, 2012.

J. Liu, J. Wu, and D. K. Panda, High Performance RDMA-Based MPI Implementation over InfiniBand, International Journal of Parallel Programming, vol.32, issue.3, pp.167-198, 2004.

F. Loulergue, in M. Valero, K. Joe, M. Kitsuregawa, and H. Tanaka (eds.), High Performance Computing, number 1940 in Lecture Notes in Computer Science, vol.64, p.67, 2000.

F. Loulergue, A Verified Accumulate Algorithmic Skeleton, 2017 Fifth International Symposium on Computing and Networking (CANDAR), pp.420-426, 2017.
URL : https://hal.archives-ouvertes.fr/hal-02317096

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn et al., Pregel: A system for large-scale graph processing, Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, p.63, 2010.

H. Markram, The blue brain project, Nature Reviews Neuroscience, vol.7, issue.2, p.185, 2006.

J. Marshall, A. Adcroft, C. Hill, L. Perelman, and C. Heisey, A finite-volume, incompressible Navier Stokes model for studies of the ocean on parallel computers, Journal of Geophysical Research: Oceans, vol.102, issue.C3, p.185, 1997.

A. J. McPherson, V. Nagarajan, and M. Cintra, Static Approximation of MPI Communication Graphs for Optimized Process Placement, Languages and Compilers for Parallel Computing, pp.268-283, 2014.

Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, Version 3.0, High Performance Computing Center Stuttgart (HLRS), vol.33, p.184, 2012.

J. Midtgaard, F. Nielson, and H. R. Nielson, Process-local static analysis of synchronous processes, International Static Analysis Symposium, p.72, 2018.

R. Miller, A library for bulk-synchronous parallel programming, Proceedings of the BCS Parallel Processing Specialist Group Workshop on General Purpose Parallel Computing, vol.63, p.114, 1993.

R. Milner, A Calculus of Communicating Systems, 1980.

A. Miné, Relational Thread-Modular Static Value Analysis by Abstract Interpretation, Verification, Model Checking, and Abstract Interpretation, pp.39-58, 2014.

M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik, Chaff: Engineering an Efficient SAT Solver, Proceedings of the 38th Annual Design Automation Conference, DAC '01, pp.530-535, 2001.

M. Naik and A. Aiken, Conditional must not aliasing for static race detection, ACM SIGPLAN Notices, vol.42, p.76, 2007.

R. Nakade, E. Mercer, P. Aldous, and J. McCarthy, Model-Checking Task Parallel Programs for Data-Race, NASA Formal Methods, pp.367-382, Springer, 2018.

G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer, CIL: Intermediate language and tools for analysis and transformation of C programs, Compiler Construction, p.53, 2002.

N. Ng, N. Yoshida, O. Pernet, R. Hu, and Y. Kryftis, Safe parallel programming with Session Java, International Conference on Coordination Languages and Models, p.75, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01582999

N. Ng, N. Yoshida, and K. Honda, Multiparty Session C: Safe parallel programming with message optimisation, International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, p.75, 2012.

N. Ng, N. Yoshida, and W. Luk, Scalable Session Programming for Heterogeneous High-Performance Systems, Software Engineering and Formal Methods, pp.82-98, 2013.

F. Nielson, H. R. Nielson, and C. Hankin, Principles of Program Analysis, vol.44, p.147, 2004.

H. R. Nielson and F. Nielson, Semantics with Applications, p.193, 2007.

S. Owicki and D. Gries, Verifying Properties of Parallel Programs: An Axiomatic Approach, Commun. ACM, vol.19, issue.5, pp.279-285, 1976.

H. Ozaktas, C. Rochange, and P. Sainrat, Automatic WCET analysis of real-time parallel applications, 13th Workshop on Worst-Case Execution Time Analysis (WCET 2013), p.77, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01239727

S. Pelagatti, Structured Development of Parallel Programs, vol.102, p.62, 1998.

A. Pnueli, The temporal logic of programs, Foundations of Computer Science, p.68, 1977.

A. Podelski and A. Rybalchenko, A complete method for the synthesis of linear ranking functions, Verification, Model Checking, and Abstract Interpretation, p.125, 2004.

S. Pophale, O. Hernandez, S. Poole, and B. M. Chapman, Extending the OpenSHMEM Analyzer to Perform Synchronization and Multi-valued Analysis, OpenSHMEM and Related Technologies. Experiences, Implementations, and Tools, number 8356 in Lecture Notes in Computer Science, pp.134-148, 2014.

R. Preissl, T. Köckerbauer, M. Schulz, D. Kranzlmüller, B. R. de Supinski et al., Detecting Patterns in MPI Communication Traces, 37th International Conference on Parallel Processing, pp.230-237, 2008.

R. Preissl, M. Schulz, D. Kranzlmüller, B. R. de Supinski, and D. J. Quinlan, Using MPI Communication Patterns to Guide Source Code Transformations, Computational Science - ICCS 2008, number 5103 in Lecture Notes in Computer Science, pp.253-260, 2008.

J. Randmets, Static cost analysis, p.125, 2012.

H. G. Rice, Classes of recursively enumerable sets and their decision problems, Transactions of the American Mathematical Society, vol.74, issue.2, pp.358-366, 1953.

M. Rinard, Analysis of multithreaded programs, Static Analysis, p.77, 2001.

D. M. Ritchie, B. W. Kernighan, and M. E. Lesk, The C Programming Language, p.161, 1988.

E. Saillard, P. Carribault, and D. Barthou, PARCOACH: Combining static and dynamic validation of MPI collective communications, The International Journal of High Performance Computing Applications, vol.28, p.79, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01078762

C. Santos, F. Martins, and V. T. Vasconcelos, Deductive Verification of Parallel Programs Using Why3, Electronic Proceedings in Theoretical Computer Science, vol.189, p.68, 2015.

V. Saraswat, B. Bloom, I. Peshansky, O. Tardieu, and D. Grove, X10 language specification, Specification, IBM, January, p.73, 2012.

S. Sharma, S. Vakkalanka, G. Gopalakrishnan, R. M. Kirby, R. Thakur et al., A formal approach to detect functionally irrelevant barriers in MPI programs, Recent Advances in Parallel Virtual Machine and Message Passing Interface, p.69, 2008.

D. R. Shires, L. L. Pollock, and S. Sprenkle, Program flow graph construction for static analysis of MPI programs, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 1999, vol.71, p.72, 1999.

S. F. Siegel, Efficient Verification of Halting Properties for MPI Programs with Wildcard Receives, Verification, Model Checking, and Abstract Interpretation, p.413

S. F. Siegel, Model Checking Nonblocking MPI Programs, Verification, Model Checking, and Abstract Interpretation, number 4349 in Lecture Notes in Computer Science, pp.44-58, 2007.

S. F. Siegel, Verifying Parallel Programs with MPI-Spin, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.13-14, 2007.

S. F. Siegel and G. S. Avrunin, Verification of Halting Properties for MPI Programs Using Nonblocking Operations, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.326-334

S. F. Siegel and T. K. Zirkel, TASS: The Toolkit for Accurate Scientific Software, Mathematics in Computer Science, vol.5, issue.4, pp.395-426, 2011.

S. F. Siegel and T. K. Zirkel, Loop Invariant Symbolic Execution for Parallel Programs, Verification, Model Checking, and Abstract Interpretation, vol.7148, p.187, 2012.

A. Spector and D. Gifford, The Space Shuttle Primary Computer System, Commun. ACM, vol.27, issue.9, pp.872-900, 1984.

G. Staple and K. Werbach, The end of spectrum scarcity [spectrum allocation and utilization], IEEE Spectrum, vol.41, issue.3, p.1, 2004.

T. Sterling, M. Anderson, and M. Brodowicz, Chapter 8 - The Essential MPI, High Performance Computing, pp.249-284, 2018.

A. Stewart, M. Clint, and J. Gabarró, Axiomatic frameworks for developing BSP-style programs, Parallel Algorithms and Applications, vol.14, issue.4, p.67, 2000.

M. M. Strout, B. Kreaseck, and P. D. Hovland, Data-flow analysis for MPI programs, ICPP 2006. International Conference On, vol.71, p.72, 2006.

W. Suijlen, Mock BSPlib for Testing and Debugging Bulk Synchronous Parallel Software, Parallel Processing Letters, vol.27, issue.01, p.79, 2017.

W. J. Suijlen and P. Krusche, BSPonMPI, p.39, 2013.

W. J. Suijlen and A. N. Yzelman, Lightweight Parallel Foundations: A new communication layer, Huawei Technologies France / 2012 Laboratories / CSI / DPSL / PADAL, 20 Quai du Point du Jour, vol.63, p.102, 2018.

J. Tesson and F. Loulergue, Formal Semantics of DRMA-Style Programming in BSPlib, Parallel Processing and Applied Mathematics, vol.4967, p.184, 2008.

J. Tesson and F. Loulergue, A verified bulk synchronous parallel ML heat diffusion simulation, Procedia Computer Science, vol.4, p.67, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00588894

A. Tiskin, The Design and Analysis of Bulk-Synchronous Parallel Algorithms, vol.4, p.59, 1998.

S. Tripakis, C. Stergiou, and R. Lublinerman, Checking Equivalence of SPMD Programs Using Non-Interference, California Univ. Berkeley, Dept. of Electrical Engineering and Computer Science, p.74, 2010.

A. Turing, Checking a large routine, The Early British Computer Conferences, p.65, 1989.

S. Vakkalanka, G. Gopalakrishnan, and R. M. Kirby, Dynamic verification of MPI programs with reductions in presence of split operations and relaxed orderings, International Conference on Computer Aided Verification, p.69, 2008.

S. S. Vakkalanka, S. Sharma, G. Gopalakrishnan, and R. M. Kirby, ISP: A Tool for Model Checking MPI Programs, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '08, pp.285-286, 2008.

L. G. Valiant, A Bridging Model for Parallel Computation, Commun. ACM, vol.33, issue.8, p.56, 1990.

L. G. Valiant, A Bridging Model for Multi-core Computing, J. Comput. Syst. Sci, vol.77, issue.1, p.59, 2011.

M. van Duijn, Extending the BSP model to hierarchical heterogeneous architectures, p.39, 2018.

S. Verdoolaege, isl: An Integer Set Library for the Polyhedral Model, International Congress on Mathematical Software, vol.143, p.147, 2010.

S. Verdoolaege and T. Grosser, Polyhedral Extraction Tool, Second International Workshop on Polyhedral Compilation Techniques (IMPACT'12), p.141, 2012.

S. Verdoolaege, R. Seghir, K. Beyls, V. Loechner, and M. Bruynooghe, Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions, Algorithmica, vol.48, issue.1, p.147, 2007.

J. S. Vetter and B. R. de Supinski, Dynamic Software Testing of MPI Applications with Umpire, Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, SC '00, 2000.

A. Vo, S. Vakkalanka, M. Delisi, G. Gopalakrishnan, R. M. Kirby et al., Formal verification of practical MPI programs, ACM Sigplan Notices, vol.44, issue.4, p.69, 2009.

P. Wang, Y. Du, H. Fu, X. Yang, and H. Zhou, Static Analysis for Application-Level Checkpointing of MPI Programs, 10th IEEE International Conference on High Performance Computing and Communications, pp.548-555, 2008.

B. Wegbreit, Mechanical Program Analysis, Commun. ACM, vol.18, issue.9, p.121, 1975.

R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing et al., The Worst-Case Execution-Time Problem: Overview of Methods and Survey of Tools, ACM Transactions on Embedded Computing Systems (TECS), vol.7, issue.3, p.147, 2008.

N. Williams, B. Marre, P. Mouy, and M. Roger, PathCrawler: Automatic generation of path tests by combining static and dynamic analysis, Dependable Computing - EDCC 5, vol.3463, pp.281-292, 2005.
URL : https://hal.archives-ouvertes.fr/hal-01810201

G. Winskel, The Formal Semantics of Programming Languages: An Introduction, vol.45, p.165, 1993.

K. Yelick, L. Semenzato, G. Pike, C. Miyamoto, B. Liblit et al., Titanium: A high-performance Java dialect, Concurrency: Practice and Experience, vol.10, p.6, 1998.

T. Yuki, P. Feautrier, S. Rajopadhye, and V. Saraswat, Array dataflow analysis for polyhedral X10 programs, ACM SIGPLAN Notices, vol.48, p.76, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00761537

T. Yuki, P. Feautrier, S. Rajopadhye, and V. Saraswat, Checking Race Freedom of Clocked X10 Programs, vol.74, p.76, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00907723

A. Yzelman and R. H. Bisseling, An object-oriented bulk synchronous parallel library for multicore programming, Concurrency and Computation: Practice and Experience, vol.24, issue.5, p.183, 2012.

A. N. Yzelman, R. H. Bisseling, and D. Roose, MulticoreBSP for C: A High-Performance Library for Shared-Memory Parallel Programming, International Journal of Parallel Programming, vol.42, issue.4, p.63, 2014.

Y. Zhang and E. Duesterwald, Barrier Matching for Programs with Textually Unaligned Barriers, Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '07, vol.75, p.187, 2007.

Y. Zhang, E. Duesterwald, and G. R. Gao, Concurrency Analysis for Shared Memory Programs with Textually Unaligned Barriers, Languages and Compilers for Parallel Computing, pp.95-109, 2008.

M. Zheng, M. S. Rogers, Z. Luo, M. B. Dwyer, and S. F. Siegel, CIVL: Formal verification of parallel programs, 30th IEEE/ACM International Conference On, pp.830-835, IEEE, vol.69, p.79, 2015.

J. Zhou and Y. Chen, Generating C Code from LOGS Specifications, Theoretical Aspects of Computing - ICTAC 2005, vol.3722, pp.195-210, 2005.

W. Zimmermann, Automatic Worst Case Complexity Analysis of Parallel Programs. International Computer Science Institute, vol.78, p.152, 1990.

Z. Luo and S. F. Siegel, Towards Deductive Verification of Message-Passing Parallel Programs, p.67, 2018.

T. K. Zirkel, S. F. Siegel, and T. McClory, Automated Verification of Chapel Programs Using Model Checking and Symbolic Execution, NASA Formal Methods, vol.69, p.70, 2013.

This document has been written in the GNU Emacs editor and the LaTeX 2ε document preparation system.