Communication-Avoiding QR Decomposition for GPUs, 2011 IEEE International Parallel & Distributed Processing Symposium, 2011. ,
DOI : 10.1109/IPDPS.2011.15
A Partial Condition Number for Linear Least Squares Problems, SIAM Journal on Matrix Analysis and Applications, vol.29, issue.2, pp.413-433, 2007. ,
DOI : 10.1137/050643088
Solving Sparse Linear Systems with Sparse Backward Error, SIAM Journal on Matrix Analysis and Applications, vol.10, issue.2, pp.165-190, 1989. ,
DOI : 10.1137/0610013
Accurate Symmetric Indefinite Linear Equation Solvers, SIAM Journal on Matrix Analysis and Applications, vol.20, issue.2, pp.513-561, 1998. ,
DOI : 10.1137/S0895479896296921
Blendenpik: Supercharging LAPACK's Least-Squares Solver, SIAM Journal on Scientific Computing, vol.32, issue.3, pp.1217-1236, 2010. ,
DOI : 10.1137/090767911
A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.2012-2026, 2012. ,
DOI : 10.1109/IPDPS.2012.12
URL : https://hal.archives-ouvertes.fr/inria-00631361
Accelerating scientific computations with mixed precision algorithms, Computer Physics Communications, vol.180, issue.12, pp.2526-2533, 2009. ,
DOI : 10.1016/j.cpc.2008.11.005
A class of communication-avoiding algorithms for solving general dense linear systems on CPU, International Conference on Computational Science, Procedia Computer Science, pp.2012-2029, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00656457
Enhancing the performance of dense linear algebra solvers on GPUs in the MAGMA project, Poster at Supercomputing (SC'08), 2008. ,
Computing the conditioning of the components of a linear least-squares solution, Numerical Linear Algebra with Applications, vol.5, issue.2, pp.517-533, 2009. ,
DOI : 10.1002/nla.627
Accelerating Linear System Solutions Using Randomization Techniques, ACM Transactions on Mathematical Software, vol.39, issue.2, 2012. ,
DOI : 10.1145/2427023.2427025
URL : https://hal.archives-ouvertes.fr/inria-00593306
Some issues in dense linear algebra for multicore and special purpose architectures, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA'08), Lecture Notes in Computer Science, pp.6126-6127, 2008. ,
Using dual techniques to derive componentwise and mixed condition numbers for a linear function of a linear least squares solution, BIT Numerical Mathematics, vol.26, issue.3, pp.3-19, 2009. ,
DOI : 10.1007/s10543-009-0213-4
Efficient computation of condition estimates for linear least squares problems ,
URL : https://hal.archives-ouvertes.fr/hal-00731136
Floating-point geometry: toward guaranteed geometric computations with approximate arithmetics, Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII, 2008. ,
DOI : 10.1117/12.796597
URL : https://hal.archives-ouvertes.fr/hal-00321291
Reducing the Amount of Pivoting in Symmetric Indefinite Systems, 9th International Conference on Parallel Processing and Applied Mathematics, pp.133-142, 2011. ,
DOI : 10.1007/978-3-642-31464-3_14
URL : https://hal.archives-ouvertes.fr/inria-00593694
Numerical methods for least squares problems, 1996. ,
DOI : 10.1137/1.9781611971484
Installation Guide for LAPACK, LAPACK Working Note 41, 1999. ,
DAGuE: A generic distributed DAG engine for high performance computing, Parallel Computing, 2011. ,
Some stable methods for calculating inertia and solving symmetric linear systems, Mathematics of Computation, vol.31, issue.137, pp.31-163, 1977. ,
DOI : 10.1090/S0025-5718-1977-0428694-0
Direct Methods for Solving Symmetric Indefinite Systems of Linear Equations, SIAM Journal on Numerical Analysis, vol.8, issue.4, pp.639-655, 1971. ,
DOI : 10.1137/0708060
The Impact of Multicore on Math Software, Proceedings of PARA 2006, Workshop on state-of-the art in scientific computing, 2006. ,
DOI : 10.1007/978-3-540-75755-9_1
A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009. ,
DOI : 10.1016/j.parco.2008.10.002
A Subspace Error Estimate for Linear Systems, SIAM Journal on Matrix Analysis and Applications, vol.24, issue.3, pp.787-801, 2003. ,
DOI : 10.1137/S0895479801390649
On the Sensitivity of Solution Components in Linear Systems of Equations, SIAM Journal on Matrix Analysis and Applications, vol.16, issue.1, pp.93-112, 1995. ,
DOI : 10.1137/S0895479892231255
On mixed and componentwise condition numbers for Moore-Penrose inverse and linear least squares problems, Mathematics of Computation, vol.76, issue.258, pp.947-963, 2007. ,
DOI : 10.1090/S0025-5718-06-01913-2
Error bounds from extra-precise iterative refinement, ACM Transactions on Mathematical Software, vol.32, issue.2, pp.325-351, 2006. ,
DOI : 10.1145/1141885.1141894
Extra-Precise Iterative Refinement for Overdetermined Least Squares Problems, ACM Transactions on Mathematical Software, vol.35, issue.4, pp.1-32, 2009. ,
DOI : 10.1145/1462173.1462177
An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination, SIAM Journal on Matrix Analysis and Applications, vol.20, issue.4, pp.915-952, 1999. ,
DOI : 10.1137/S0895479897317685
Monte Carlo methods for applied scientists, World Scientific, 2008. ,
Adapting communication-avoiding LU and QR factorizations to multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-10, 2010. ,
DOI : 10.1109/IPDPS.2010.5470348
Numerical linear algebra for high-performance computers, SIAM, 1998. ,
Achieving numerical accuracy and high performance using recursive tile LU factorization, 2011. ,
DOI : 10.1002/cpe.3110
URL : https://hal.archives-ouvertes.fr/hal-00809765
Direct methods for sparse matrices, 1986. ,
Perturbation Theory for the Least Squares Problem with Linear Equality Constraints, SIAM Journal on Numerical Analysis, vol.17, issue.3, pp.338-350, 1980. ,
DOI : 10.1137/0717028
EVE, an Object Oriented SIMD Library, Scalable Computing: Practice and Experience, vol.6, issue.4, pp.31-41, 2005. ,
DOI : 10.1007/978-3-540-24688-6_43
URL : https://hal.archives-ouvertes.fr/hal-00103176
A contribution to the theory of condition, Numerische Mathematik, vol.19, issue.1, pp.85-96, 1982. ,
DOI : 10.1007/BF01399313
Note on the iterative refinement of least squares solution, Numerische Mathematik, vol.5, issue.2, pp.139-148, 1966. ,
DOI : 10.1007/BF02166032
An Analysis of the Total Least Squares Problem, SIAM Journal on Numerical Analysis, vol.17, issue.6, pp.883-893, 1980. ,
DOI : 10.1137/0717073
A numerical evaluation of sparse solvers for symmetric systems, ACM Trans. Math. Softw, vol.33, issue.10, pp.1-1032, 2007. ,
Kronecker products and matrix calculus with application, 1981. ,
On the condition number of linear least squares problems in a weighted Frobenius norm, BIT Numerical Mathematics, vol.13, issue.89, pp.523-530, 1996. ,
DOI : 10.1007/BF01731931
Adjoint formulas for condition numbers applied to linear and indefinite least squares, 2004. ,
CALU: A Communication Optimal LU Factorization Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.4, pp.1317-1350, 2011. ,
DOI : 10.1137/100788926
URL : https://hal.archives-ouvertes.fr/hal-00651137
Small-Sample Statistical Estimates for Matrix Norms, SIAM Journal on Matrix Analysis and Applications, vol.16, issue.3, pp.776-792, 1995. ,
DOI : 10.1137/S0895479893243876
FLAME: Formal Linear Algebra Methods Environment, ACM Transactions on Mathematical Software, vol.27, issue.4, pp.422-455, 2001. ,
DOI : 10.1145/504210.504213
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.118.7096
Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-755, 1997. ,
DOI : 10.1147/rd.416.0737
Condition Estimates, SIAM Journal on Scientific and Statistical Computing, vol.5, issue.2, pp.311-316, 1984. ,
DOI : 10.1137/0905023
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Review, vol.53, issue.2, pp.217-288, 2011. ,
DOI : 10.1137/090771806
On Using an Hybrid MPI-Thread Programming for the Implementation of a Parallel Sparse Direct Solver on a Network of SMP Nodes, PPAM'05, pp.1050-1057, 2005. ,
DOI : 10.1007/11752578_127
SLEPc, ACM Transactions on Mathematical Software, vol.31, issue.3, pp.31-351, 2005. ,
DOI : 10.1145/1089014.1089019
A survey of componentwise perturbation theory in numerical linear algebra, Mathematics of Computation 1943-1993: A Half Century of, of Proceedings of Symposia in Applied Mathematics, pp.49-77, 1994. ,
The total least squares problem: computational aspects and analysis, 1991. ,
DOI : 10.1137/1.9781611971002
Numerical matrix analysis: Linear systems and least squares, 2009. ,
DOI : 10.1137/1.9780898717686
Small-Sample Statistical Condition Estimates for General Matrix Functions, SIAM Journal on Scientific Computing, vol.15, issue.1, pp.36-61, 1994. ,
DOI : 10.1137/0915003
Statistical Condition Estimation for Linear Least Squares, SIAM Journal on Matrix Analysis and Applications, vol.19, issue.4, pp.906-923, 1998. ,
DOI : 10.1137/S0895479895291935
A Scalable Tridiagonal Solver for GPUs, 2011 International Conference on Parallel Processing, pp.444-453, 2011. ,
DOI : 10.1109/ICPP.2011.41
On the Computation of Correctly Rounded Sums, IEEE Transactions on Computers, vol.61, issue.3, pp.289-298, 2012. ,
DOI : 10.1109/TC.2011.27
URL : https://hal.archives-ouvertes.fr/inria-00475279
Implementing Linear Algebra Routines on Multi-core Processors with Pipelining and a Look Ahead, 2006. ,
DOI : 10.1007/978-3-540-75755-9_18
CADNA_C: A version of CADNA for use with C or C++ programs, Computer Physics Communications, vol.181, issue.11, p.1925, 2010. ,
DOI : 10.1016/j.cpc.2010.07.006
URL : https://hal.archives-ouvertes.fr/hal-01146511
Accuracy versus time, Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, PASCO '10, pp.120-130, 2010. ,
DOI : 10.1145/1837210.1837229
URL : https://hal.archives-ouvertes.fr/hal-00477511
Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy, Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 2006. ,
Randomized algorithms for matrices and data, Foundations and Trends in Machine Learning, vol.3, issue.2, pp.123-224, 2011. ,
Iterative Refinement in Floating Point, Journal of the ACM, vol.14, issue.2, pp.316-321, 1967. ,
DOI : 10.1145/321386.321394
An Improved MAGMA GEMM for Fermi Graphics Processing Units, The International Journal of High Performance Computing Applications, vol.27, issue.1, pp.511-515, 2010. ,
DOI : 10.1177/1094342010385729
Compatibility of approximate solution of linear equations with given error bounds for coefficients and right-hand sides, Numerische Mathematik, vol.2, issue.1, pp.405-409, 1964. ,
DOI : 10.1007/BF01386090
The ijk forms of factorization methods II. Parallel systems, Parallel Computing, vol.7, issue.2, pp.149-162, 1988. ,
DOI : 10.1016/0167-8191(88)90036-1
Core Problems in Linear Algebraic Systems, SIAM Journal on Matrix Analysis and Applications, vol.27, issue.3, pp.861-875, 2006. ,
DOI : 10.1137/040616991
LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares, ACM Transactions on Mathematical Software, vol.8, issue.1, pp.43-71, 1982. ,
DOI : 10.1145/355984.355989
Random butterfly transformations with applications in computational linear algebra, 1995. ,
A Reconstruction Method for the Flow Past an Open Cavity, Journal of Fluids Engineering, vol.128, issue.3, pp.531-540, 2006. ,
DOI : 10.1115/1.2175159
URL : https://hal.archives-ouvertes.fr/hal-00108761
Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, pp.1-26, 2009. ,
DOI : 10.1145/1527286.1527288
On fast factorization pivoting methods for symmetric indefinite systems, Elec. Trans. Numer. Anal, vol.23, pp.158-179, 2006. ,
Iterative refinement implies numerical stability for Gaussian elimination, Mathematics of Computation, vol.35, issue.151, pp.817-832, 1980. ,
DOI : 10.1090/S0025-5718-1980-0572859-4
Analysis of Pairwise Pivoting in Gaussian Elimination, IEEE Transactions on Computers, vol.34, issue.3, pp.274-278, 1984. ,
DOI : 10.1109/TC.1985.1676570
Introduction to matrix computations, 1973. ,
A dense complex symmetric indefinite solver for the Fujitsu AP3000, 1999. ,
Towards dense linear algebra for hybrid GPU accelerated manycore systems, Parallel Computing, vol.36, issue.5-6, pp.232-240, 2010. ,
DOI : 10.1016/j.parco.2009.12.005
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing, Parallel Computing, vol.36, issue.12, pp.645-654, 2010. ,
DOI : 10.1016/j.parco.2010.06.001
Average-Case Stability of Gaussian Elimination, SIAM Journal on Matrix Analysis and Applications, vol.11, issue.3, pp.335-360, 1990. ,
DOI : 10.1137/0611023
Rounding-off errors in matrix processes, The Quarterly Journal of Mechanics and Applied Mathematics, vol.1, issue.1, pp.287-308, 1948. ,
DOI : 10.1093/qjmam/1.1.287
Using GPUs to accelerate linear algebra routines, Poster at PAR lab winter retreat, 2008. ,
Cholesky factorizations using vector capabilities of GPUs, 2008. ,
Perturbation theory for pseudo-inverses, BIT, vol.17, issue.2, pp.217-232, 1973. ,
DOI : 10.1007/BF01933494
Rounding errors in algebraic processes, Her Majesty's Stationery Office, 1963. ,
One-sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators, International Conference on Computational Science, Procedia Computer Science, pp.2012-2049, 2012. ,
DOI : 10.1016/j.procs.2012.04.005
QUARK users guide: QUeueing And Runtime for Kernels, 2011. ,
Historical Development of the Newton-Raphson Method, SIAM Review, vol.37, issue.4, pp.531-551, 1995. ,
DOI : 10.1137/1037125
Linear estimation and related topics, Survey of numerical analysis, pp.558-584, 1962. ,