Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, vol.180, 2009. ,
DOI : 10.1088/1742-6596/180/1/012037
LAPACK: a portable linear algebra library for high-performance computers, Proceedings Supercomputing '90, pp.2-11, 1990. ,
Communication-optimal parallel algorithm for Strassen's matrix multiplication, Proceedings of the 24th ACM symposium on Parallelism in algorithms and architectures, SPAA '12 ,
DOI : 10.1145/2312005.2312044
Discrete Logarithm in GF(2^809) with FFS, Public-Key Cryptography - PKC 2014 - 17th International Conference on Practice and Theory in Public-Key Cryptography, pp.221-238, 2014. ,
DOI : 10.1007/978-3-642-54631-0_13
URL : https://hal.archives-ouvertes.fr/hal-00818124
A framework for practical parallel fast matrix multiplication, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, pp.42-53, 2015. ,
DOI : 10.1145/2688500.2688513
URL : http://arxiv.org/pdf/1409.2908
ScaLAPACK Users' Guide, 1997. ,
DOI : 10.1137/1.9780898719642
Cilk: Efficient Multithreaded Computing, 1996. ,
DOI : 10.1006/jpdc.1996.0107
OpenMP Application Program Interface version 3, 2008. ,
OpenMP Application Program Interface version 4, 2013. ,
The Magma Algebra System I: The User Language, Journal of Symbolic Computation, vol.24, issue.3-4, pp.235-265, 1996. ,
DOI : 10.1006/jsco.1996.0125
Groupes et Algèbres de Lie. Elements of mathematics, Chapters 4-6, 2008. ,
FFLAS-FFPACK: Finite Field Linear Algebra Subroutines ,
LinBox-1.4.0: Exact Computational Linear Algebra. url: https ,
Exact sparse matrix-vector multiplication on GPU's and multicore architectures, Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, PASCO '10, pp.80-88, 2010. ,
DOI : 10.1145/1837210.1837224
URL : http://hal.archives-ouvertes.fr/docs/00/47/51/85/PDF/ffspmv.pdf
Memory efficient scheduling of Strassen-Winograd's matrix multiplication algorithm, Proceedings of the 2009 international symposium on Symbolic and algebraic computation, ISSAC '09, pp.55-62, 2009. ,
DOI : 10.1145/1576702.1576713
libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms, OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, IWOMP 2012, pp.102-115, 2012. ,
DOI : 10.1007/978-3-642-30961-8_8
URL : https://hal.archives-ouvertes.fr/hal-00796253
Sur les représentations induites des groupes de Lie, Bulletin de la Société mathématique de France, vol.79, issue.86911, pp.97-205, 1956. ,
DOI : 10.24033/bsmf.1469
Triangular factorization and inversion by fast matrix multiplication, Mathematics of Computation, vol.28, issue.125, pp.231-236, 1974. ,
DOI : 10.21236/ad0754790
A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009. ,
DOI : 10.1016/j.parco.2008.10.002
A BLAS based C library for exact linear algebra on integer matrices, Proceedings of the 2005 international symposium on Symbolic and algebraic computation, ISSAC '05, pp.92-99, 2005. ,
DOI : 10.1145/1073884.1073899
Cache-only memory architectures, Computer, vol.32, issue.6, pp.72-79, 1999. ,
DOI : 10.1109/2.769448
URL : http://iacoma.cs.uiuc.edu/iacoma-papers/encyclopedia_coma.pdf
Exploiting Parallelism in Matrix-computation Kernels for Symmetric Multiprocessor Systems: Matrix-multiplication and Matrix-addition Algorithm Optimizations by Software Pipelining and Threads Allocation, In: ACM Trans. Math. Softw, vol.38, issue.2, pp.1-2, 2011. ,
Sur quelques algorithmes de recherche de valeurs propres, 1973. ,
URL : https://hal.archives-ouvertes.fr/tel-00010274
A survey of recent developments in parallel implementations of Gaussian elimination, Concurrency and Computation: Practice and Experience, pp.1292-1309, 2015. ,
DOI : 10.1109/CLUSTR.2007.4629221
URL : https://hal.archives-ouvertes.fr/hal-00986948
Numerical Linear Algebra for High-Performance Computers, 1998. ,
DOI : 10.1137/1.9780898719611
Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization, Concurrency and Computation: Practice and Experience, vol.267, 2014. ,
DOI : 10.1002/cpe.3110
URL : https://hal.archives-ouvertes.fr/hal-00865472
GEMMW: A Portable Level 3 BLAS Winograd Variant of Strassen's Matrix-Matrix Multiply Algorithm, Journal of Computational Physics, vol.110, issue.1, pp.1-10, 1994. ,
DOI : 10.1006/jcph.1994.1001
Direct methods for sparse matrices, 1986. ,
DOI : 10.1093/acprof:oso/9780198508380.001.0001
Simultaneous modular reduction and Kronecker substitution for small finite fields, Journal of Symbolic Computation, vol.46, issue.7, pp.823-840, 2011 (Special Issue in Honour of Keith Geddes on his 60th Birthday). ,
Givaro-4.0.1: une bibliothèque C++ pour le Calcul Formel. Software IMAG-LMC, 2004. ,
Recursion based parallelization of exact dense linear algebra routines for Gaussian elimination, Parallel Computing, vol.57, 2015. ,
DOI : 10.1016/j.parco.2015.10.003
URL : https://hal.archives-ouvertes.fr/hal-01084238
Parallel Computation of Echelon Forms, Euro-Par 2014 Parallel Processing - 20th International Conference, pp.499-510, 2014. ,
DOI : 10.1007/978-3-319-09873-9_42
URL : https://hal.archives-ouvertes.fr/hal-00947013
FFPACK, Proceedings of the 2004 international symposium on Symbolic and algebraic computation, ISSAC '04, pp.119-126, 2004. ,
DOI : 10.1145/1005285.1005304
URL : https://hal.archives-ouvertes.fr/hal-00018223
Dense Linear Algebra over Word-Size Prime Fields, ACM Transactions on Mathematical Software, vol.35, issue.3, pp.1-19, 2008. ,
DOI : 10.1145/1391989.1391992
URL : https://hal.archives-ouvertes.fr/hal-00018223
Computational linear algebra over finite fields, In: Handbook of Finite Fields, 2013. ,
Adaptive Triangular System Solving, In: Challenges in Symbolic Computation Software, Dagstuhl Seminar Proceedings 06271. Dagstuhl, Germany: Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, 2006. ,
Simultaneous computation of the row and column rank profiles, Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, ISSAC '13, 2013. ,
DOI : 10.1145/2465506.2465517
URL : https://hal.archives-ouvertes.fr/hal-00778136
Computing the Rank Profile Matrix, Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC '15, pp.149-156, 2015. ,
DOI : 10.1007/978-3-642-15274-0_16
URL : https://hal.archives-ouvertes.fr/hal-01107722
Efficient computation of the characteristic polynomial, Proceedings of the 2005 international symposium on Symbolic and algebraic computation, ISSAC '05, pp.140-147, 2005. ,
DOI : 10.1145/1073884.1073905
URL : https://hal.archives-ouvertes.fr/hal-00004056
On parallel block algorithms for exact triangularizations, Parallel Computing, vol.28, issue.11, pp.1531-1548, 2002. ,
DOI : 10.1016/S0167-8191(02)00161-8
URL : http://www-id.imag.fr/~jgdumas/Publications/Turbo.ps.gz
A new efficient algorithm for computing Gröbner bases (F4), Journal of Pure and Applied Algebra, vol.139, issue.1-3, pp.61-88, 1999. ,
Modern Computer Algebra, 1999. ,
KAAPI, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, pp.27-28, 2007. ,
DOI : 10.1145/1278177.1278182
URL : https://hal.archives-ouvertes.fr/hal-00727795
XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, pp.1299-1308, 2013. ,
DOI : 10.1109/IPDPS.2013.66
URL : https://hal.archives-ouvertes.fr/hal-00799904
Parallel algebraic linear algebra dedicated interface, Proceedings of the 2015 International Workshop on Parallel Symbolic Computation, PASCO '15, pp.34-43, 2015. ,
DOI : 10.1145/1278177.1278182
URL : https://hal.archives-ouvertes.fr/hal-01221106
Anatomy of high-performance matrix multiplication, ACM Transactions on Mathematical Software, vol.34, issue.3, pp.1-12, 2008. ,
DOI : 10.1145/1356052.1356053
URL : http://www.cs.utexas.edu/users/flame/pubs/GOTO_TOMS.ps
Analogy of Bruhat decomposition for the closure of a cone of Chevalley group of a classical serie, In: Soviet Mathematics Doklady, vol.23, issue.2, pp.393-397, 1981. ,
Additive complexity in directed computations, In: Theoretical Computer Science, vol.19, pp.39-67, 1982. ,
CALU: A Communication Optimal LU Factorization Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.4, pp.1317-1350, 2011. ,
DOI : 10.1137/100788926
URL : https://hal.archives-ouvertes.fr/hal-00651137
Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-756, 1997. ,
DOI : 10.1147/rd.416.0737
Recursive blocked data formats and BLAS's for dense linear algebra algorithms, In: PARA. Ed. by B. Kagstrom, J. Dongarra, E. Elmroth, and J. Wasniewski. Lecture Notes in Computer Science, vol.1541, pp.195-206, 1998. ,
DOI : 10.1007/BFb0095337
FLINT: Fast Library for Number Theory ,
DOI : 10.1007/978-3-642-15582-6_18
URL : http://wrap.warwick.ac.uk/41629/1/WRAP_Hart_0584144-ma-270913-flint-extended-abstract.pdf
An overview of the Trilinos project, ACM Transactions on Mathematical Software, vol.31, issue.3, pp.397-423, 2005. ,
DOI : 10.1145/1089014.1089021
A generalization of the fast LUP matrix decomposition algorithm and applications, Journal of Algorithms, vol.3, issue.1, pp.45-56, 1982. ,
DOI : 10.1016/0196-6774(82)90007-4
Intel Math Kernel Library, 2007. ,
Rank-profile revealing Gaussian elimination and the CUP matrix decomposition, Journal of Symbolic Computation, vol.56, 2013. ,
DOI : 10.1016/j.jsc.2013.04.004
URL : https://hal.archives-ouvertes.fr/hal-00655543
LU factoring of non-invertible matrices, ACM SIGSAM Bulletin, vol.44, issue.1/2, pp.1-8, 2010. ,
DOI : 10.1145/1838599.1838602
The GNU OpenMP implementation. 2014. url: https ,
Fast algorithms for the characteristics polynomial, Theoretical Computer Science, vol.36, pp.309-317, 1985. ,
DOI : 10.1016/0304-3975(85)90049-0
Iterative Methods for Linear and Nonlinear Equations, 1995. ,
Anatomy of a Parallel Out-of-Core Dense Linear Solver, In: ICPP, vol.3, pp.29-33, 1995. ,
A tensor product formulation of Strassen's matrix multiplication algorithm with memory reduction, Proceedings of the Seventh International Parallel Processing Symposium, pp.582-588, 1993. ,
Scheduling dense linear algebra operations on multicore processors, Concurrency and Computation: Practice and Experience, pp.15-44, 2010. ,
DOI : 10.1137/1.9781611971446
Multithreading in the PLASMA Library, Multicore Computing: Algorithms, Architectures, and Applications, p.119, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00809774
Powers of tensors and fast matrix multiplication, Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pp.296-303, 2014. ,
DOI : 10.1145/2608628.2608664
Fast Generalized Bruhat Decomposition, CASC'10, pp.194-202, 2010. ,
DOI : 10.1109/SFCS.1992.267779
Bruhat canonical form for linear systems, Linear Algebra and its Applications, vol.425, issue.2-3, pp.2-3, 2007. ,
DOI : 10.1016/j.laa.2007.01.022
Combinatorial commutative algebra, 2005. ,
An LLL-reduction algorithm with quasi-linear time complexity, Proceedings of the 43rd annual ACM symposium on Theory of computing, STOC '11, pp.403-412, 2011. ,
DOI : 10.1145/1993636.1993691
URL : https://hal.archives-ouvertes.fr/ensl-00534899
Elemental, ACM Transactions on Mathematical Software, vol.39, issue.2, pp.13:1-13:24, 2013. ,
DOI : 10.1145/2427023.2427030
A library for Number Theory ,
The matrix template library: A generic programming approach to high performance numerical linear algebra, In: Computing in Object-Oriented Parallel Environments, pp.59-70, 1998. ,
Modular forms, a computational approach, Graduate studies in mathematics, 2007. ,
Algorithms for Matrix Canonical Forms, 2000. ,
Gaussian elimination is not optimal, Numerische Mathematik, vol.13, issue.4, pp.354-356, 1969. ,
DOI : 10.1007/BF02165411
Locality of Reference in LU Decomposition with Partial Pivoting, SIAM Journal on Matrix Analysis and Applications, vol.18, issue.4, pp.1065-1081, 1997. ,
DOI : 10.1137/S0895479896297744
Calcul formel et parallélisme: résolution de systèmes linéaires, 1988. ,
Intel Math Kernel Library, High-Performance Computing on the Intel Xeon Phi, pp.167-188, 2014. ,
DOI : 10.1007/978-3-319-06486-4_7
AUGEM, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '13, pp.1-25, 2013. ,
DOI : 10.1145/2503210.2503219
Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001 (New Trends in High Performance Computing). ,
Multiplying matrices faster than Coppersmith-Winograd, Proceedings of the 44th symposium on Theory of Computing, STOC '12, pp.887-898, 2012. ,
DOI : 10.1145/2213977.2214056
On multiplication of 2 × 2 matrices, Linear Algebra and its Applications, vol.4, issue.4, pp.381-388, 1971. ,
DOI : 10.1016/0024-3795(71)90009-7
Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor, 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pp.684-691, 2012. ,
DOI : 10.1109/ICPADS.2012.97
Perturbation analysis of the QR Factor R in the context of LLL lattice basis reduction ,
DOI : 10.1090/S0025-5718-2012-02545-2