M. Anderson, G. Ballard, J. Demmel, and K. Keutzer, Communication-Avoiding QR Decomposition for GPUs, 2011 IEEE International Parallel & Distributed Processing Symposium, 2011.
DOI : 10.1109/IPDPS.2011.15

M. Arioli, M. Baboulin, and S. Gratton, A Partial Condition Number for Linear Least Squares Problems, SIAM Journal on Matrix Analysis and Applications, vol.29, issue.2, pp.413-433, 2007.
DOI : 10.1137/050643088

M. Arioli, J. W. Demmel, and I. S. Duff, Solving Sparse Linear Systems with Sparse Backward Error, SIAM Journal on Matrix Analysis and Applications, vol.10, issue.2, pp.165-190, 1989.
DOI : 10.1137/0610013

C. Ashcraft, R. G. Grimes, and J. G. Lewis, Accurate Symmetric Indefinite Linear Equation Solvers, SIAM Journal on Matrix Analysis and Applications, vol.20, issue.2, pp.513-561, 1998.
DOI : 10.1137/S0895479896296921

A. Avron, P. Maymounkov, and S. Toledo, Blendenpik: Supercharging LAPACK's Least-Squares Solver, SIAM Journal on Scientific Computing, vol.32, issue.3, pp.1217-1236, 2010.
DOI : 10.1137/090767911

M. Baboulin, D. Becker, and J. Dongarra, A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.2012-2026, 2012.
DOI : 10.1109/IPDPS.2012.12

URL : https://hal.archives-ouvertes.fr/inria-00631361

M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou et al., Accelerating scientific computations with mixed precision algorithms, Computer Physics Communications, vol.180, issue.12, pp.2526-2533, 2009.
DOI : 10.1016/j.cpc.2008.11.005

M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy et al., A class of communication-avoiding algorithms for solving general dense linear systems on cpu, International Conference on Computational Science Procedia Computer Science, pp.2012-2029, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00656457

M. Baboulin, J. Dongarra, J. Demmel, S. Tomov, and V. Volkov, Enhancing the performance of dense linear algebra solvers on GPUs in the MAGMA project, Poster at Supercomputing (SC'08), 2008.

M. Baboulin, J. Dongarra, S. Gratton, and J. Langou, Computing the conditioning of the components of a linear least-squares solution, Numerical Linear Algebra with Applications, vol.5, issue.2, pp.517-533, 2009.
DOI : 10.1002/nla.627

M. Baboulin, J. Dongarra, J. Herrmann, and S. Tomov, Accelerating Linear System Solutions Using Randomization Techniques, ACM Transactions on Mathematical Software, vol.39, issue.2, 2012.
DOI : 10.1145/2427023.2427025

URL : https://hal.archives-ouvertes.fr/inria-00593306

M. Baboulin, J. Dongarra, and S. Tomov, Some issues in dense linear algebra for multicore and special purpose architectures, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA'08), Lecture Notes in Computer Science, pp.6126-6127, 2008.

M. Baboulin and S. Gratton, Using dual techniques to derive componentwise and mixed condition numbers for a linear function of a linear least squares solution, BIT Numerical Mathematics, vol.26, issue.3, pp.3-19, 2009.
DOI : 10.1007/s10543-009-0213-4

M. Baboulin, S. Gratton, R. Lacroix, and A. J. Laub, Efficient computation of condition estimates for linear least squares problems
URL : https://hal.archives-ouvertes.fr/hal-00731136

J. Bajard, P. Langlois, D. Michelucci, G. Morin, and N. , Floating-point geometry: toward guaranteed geometric computations with approximate arithmetics, Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII, 2008.
DOI : 10.1117/12.796597

URL : https://hal.archives-ouvertes.fr/hal-00321291

D. Becker, M. Baboulin, and J. Dongarra, Reducing the Amount of Pivoting in Symmetric Indefinite Systems, 9th International Conference on Parallel Processing and Applied Mathematics, pp.133-142, 2011.
DOI : 10.1007/978-3-642-31464-3_14

URL : https://hal.archives-ouvertes.fr/inria-00593694

Å. Björck, Numerical methods for least squares problems, 1996.
DOI : 10.1137/1.9781611971484

S. Blackford and J. Dongarra, Installation Guide for LAPACK, LAPACK Working Note 41, 1999.

G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier et al., DAGuE: A generic distributed DAG engine for high performance computing, Parallel Computing, 2011.

J. Bunch and L. Kaufman, Some stable methods for calculating inertia and solving symmetric linear systems, Mathematics of Computation, vol.31, issue.137, pp.31-163, 1977.
DOI : 10.1090/S0025-5718-1977-0428694-0

J. Bunch and B. N. Parlett, Direct Methods for Solving Symmetric Indefinite Systems of Linear Equations, SIAM Journal on Numerical Analysis, vol.8, issue.4, pp.639-655, 1971.
DOI : 10.1137/0708060

A. Buttari, J. Dongarra, J. Kurzak, J. Langou, P. Luszczek et al., The Impact of Multicore on Math Software, Proceedings of PARA 2006, Workshop on state-of-the-art in scientific computing, 2006.
DOI : 10.1007/978-3-540-75755-9_1

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

Y. Cao and L. Petzold, A Subspace Error Estimate for Linear Systems, SIAM Journal on Matrix Analysis and Applications, vol.24, issue.3, pp.787-801, 2003.
DOI : 10.1137/S0895479801390649

S. Chandrasekaran and I. C. Ipsen, On the Sensitivity of Solution Components in Linear Systems of Equations, SIAM Journal on Matrix Analysis and Applications, vol.16, issue.1, pp.93-112, 1995.
DOI : 10.1137/S0895479892231255

F. Cucker, H. Diao, and Y. Wei, On mixed and componentwise condition numbers for Moore-Penrose inverse and linear least squares problems, Mathematics of Computation, vol.76, issue.258, pp.947-963, 2007.
DOI : 10.1090/S0025-5718-06-01913-2

J. Demmel, Y. Hida, W. Kahan, X. S. Li, S. Mukherjee et al., Error bounds from extra-precise iterative refinement, ACM Transactions on Mathematical Software, vol.32, issue.2, pp.325-351, 2006.
DOI : 10.1145/1141885.1141894

J. Demmel, Y. Hida, X. S. Li, and E. J. Riedy, Extra-Precise Iterative Refinement for Overdetermined Least Squares Problems, ACM Transactions on Mathematical Software, vol.35, issue.4, pp.1-32, 2009.
DOI : 10.1145/1462173.1462177

J. W. Demmel, J. R. Gilbert, and X. S. Li, An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination, SIAM Journal on Matrix Analysis and Applications, vol.20, issue.4, pp.915-952, 1999.
DOI : 10.1137/S0895479897317685

I. Dimov, Monte Carlo methods for applied scientists, World Scientific, 2008.

S. Donfack, L. Grigori, and A. K. Gupta, Adapting communication-avoiding LU and QR factorizations to multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-10, 2010.
DOI : 10.1109/IPDPS.2010.5470348

J. Dongarra, I. Duff, D. Sorensen, and H. van der Vorst, Numerical linear algebra for high-performance computers, SIAM, 1998.

J. Dongarra, M. Faverge, H. Ltaief, and P. Luszczek, Achieving numerical accuracy and high performance using recursive tile LU factorization, 2011.
DOI : 10.1002/cpe.3110

URL : https://hal.archives-ouvertes.fr/hal-00809765

I. S. Duff, A. M. Erisman, and J. K. Reid, Direct methods for sparse matrices, 1986.

L. Eldén, Perturbation Theory for the Least Squares Problem with Linear Equality Constraints, SIAM Journal on Numerical Analysis, vol.17, issue.3, pp.338-350, 1980.
DOI : 10.1137/0717028

J. Falcou and J. Sérot, E.V.E., an Object Oriented SIMD Library, Scalable Computing: Practice and Experience, vol.6, issue.4, pp.31-41, 2005.
DOI : 10.1007/978-3-540-24688-6_43

URL : https://hal.archives-ouvertes.fr/hal-00103176

A. J. Geurts, A contribution to the theory of condition, Numerische Mathematik, vol.19, issue.1, pp.85-96, 1982.
DOI : 10.1007/BF01399313

G. Golub and J. Wilkinson, Note on the iterative refinement of least squares solution, Numerische Mathematik, vol.5, issue.2, pp.139-148, 1966.
DOI : 10.1007/BF02166032

G. H. Golub and C. F. Van Loan, An Analysis of the Total Least Squares Problem, SIAM Journal on Numerical Analysis, vol.17, issue.6, pp.883-893, 1980.
DOI : 10.1137/0717073

N. I. Gould, J. A. Scott, and Y. Hu, A numerical evaluation of sparse solvers for symmetric systems, ACM Trans. Math. Softw, vol.33, issue.10, pp.1-1032, 2007.

A. Graham, Kronecker products and matrix calculus with application, 1981.

S. Gratton, On the condition number of linear least squares problems in a weighted Frobenius norm, BIT Numerical Mathematics, vol.13, issue.89, pp.523-530, 1996.
DOI : 10.1007/BF01731931

J. F. Grcar, Adjoint formulas for condition numbers applied to linear and indefinite least squares, 2004.

L. Grigori, J. Demmel, and H. Xiang, CALU: A Communication Optimal LU Factorization Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.4, pp.1317-1350, 2011.
DOI : 10.1137/100788926

URL : https://hal.archives-ouvertes.fr/hal-00651137

T. Gudmundsson, C. S. Kenney, and A. J. Laub, Small-Sample Statistical Estimates for Matrix Norms, SIAM Journal on Matrix Analysis and Applications, vol.16, issue.3, pp.776-792, 1995.
DOI : 10.1137/S0895479893243876

J. A. Gunnels, F. G. Gustavson, G. M. Henry, and R. A. van de Geijn, FLAME: Formal Linear Algebra Methods Environment, ACM Transactions on Mathematical Software, vol.27, issue.4, pp.422-455, 2001.
DOI : 10.1145/504210.504213

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.118.7096

F. G. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-755, 1997.
DOI : 10.1147/rd.416.0737

W. W. Hager, Condition Estimates, SIAM Journal on Scientific and Statistical Computing, vol.5, issue.2, pp.311-316, 1984.
DOI : 10.1137/0905023

N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Review, vol.53, issue.2, pp.217-288, 2011.
DOI : 10.1137/090771806

P. Hénon, P. Ramet, and J. Roman, On Using an Hybrid MPI-Thread Programming for the Implementation of a Parallel Sparse Direct Solver on a Network of SMP Nodes, PPAM'05, pp.1050-1057, 2005.
DOI : 10.1007/11752578_127

V. Hernandez, J. E. Roman, and V. Vidal, SLEPc, ACM Transactions on Mathematical Software, vol.31, issue.3, pp.31-351, 2005.
DOI : 10.1145/1089014.1089019

N. J. Higham, A survey of componentwise perturbation theory in numerical linear algebra, in Mathematics of Computation 1943-1993: A Half Century of Computational Mathematics, Proceedings of Symposia in Applied Mathematics, pp.49-77, 1994.

S. Van Huffel and J. Vandewalle, The total least squares problem: computational aspects and analysis, 1991.
DOI : 10.1137/1.9781611971002

I. C. Ipsen, Numerical matrix analysis: Linear systems and least squares, 2009.
DOI : 10.1137/1.9780898717686

C. S. Kenney and A. J. Laub, Small-Sample Statistical Condition Estimates for General Matrix Functions, SIAM Journal on Scientific Computing, vol.15, issue.1, pp.36-61, 1994.
DOI : 10.1137/0915003

C. S. Kenney, A. J. Laub, and M. S. Reese, Statistical Condition Estimation for Linear Least Squares, SIAM Journal on Matrix Analysis and Applications, vol.19, issue.4, pp.906-923, 1998.
DOI : 10.1137/S0895479895291935

H. Kim, S. Wu, L. Chang, and W. Hwu, A Scalable Tridiagonal Solver for GPUs, 2011 International Conference on Parallel Processing, pp.444-453, 2011.
DOI : 10.1109/ICPP.2011.41

P. Kornerup, V. Lefèvre, N. Louvet, and J. Muller, On the Computation of Correctly Rounded Sums, IEEE Transactions on Computers, vol.61, issue.3, pp.289-298, 2012.
DOI : 10.1109/TC.2011.27

URL : https://hal.archives-ouvertes.fr/inria-00475279

J. Kurzak and J. Dongarra, Implementing Linear Algebra Routines on Multi-core Processors with Pipelining and a Look Ahead, 2006.
DOI : 10.1007/978-3-540-75755-9_18

J. Lamotte, J. Chesneaux, and F. Jézéquel, CADNA_C: A version of CADNA for use with C or C++ programs, Computer Physics Communications, vol.181, issue.11, pp.1925-1926, 2010.
DOI : 10.1016/j.cpc.2010.07.006

URL : https://hal.archives-ouvertes.fr/hal-01146511

P. Langlois, M. Martel, and L. Thévenoux, Accuracy versus time, Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, PASCO '10, pp.120-130, 2010.
DOI : 10.1145/1837210.1837229

URL : https://hal.archives-ouvertes.fr/hal-00477511

J. Langou, J. Langou, P. Luszczek, J. Kurzak, A. Buttari et al., Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy, Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 2006.

M. W. Mahoney, Randomized algorithms for matrices and data, Foundations and Trends in Machine Learning, vol.3, issue.2, pp.123-224, 2011.

C. B. Moler, Iterative Refinement in Floating Point, Journal of the ACM, vol.14, issue.2, pp.316-321, 1967.
DOI : 10.1145/321386.321394

R. Nath, S. Tomov, and J. Dongarra, An Improved Magma Gemm For Fermi Graphics Processing Units, The International Journal of High Performance Computing Applications, vol.27, issue.1, pp.511-515, 2010.
DOI : 10.1177/1094342010385729

W. Oettli and W. Prager, Compatibility of approximate solution of linear equations with given error bounds for coefficients and right-hand sides, Numerische Mathematik, vol.2, issue.1, pp.405-409, 1964.
DOI : 10.1007/BF01386090

J. M. Ortega and C. H. Romine, The ijk forms of factorization methods II. Parallel systems, Parallel Computing, vol.7, issue.2, pp.149-162, 1988.
DOI : 10.1016/0167-8191(88)90036-1

C. Paige and Z. , Core Problems in Linear Algebraic Systems, SIAM Journal on Matrix Analysis and Applications, vol.27, issue.3, pp.861-875, 2006.
DOI : 10.1137/040616991

C. C. Paige and M. A. Saunders, LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares, ACM Transactions on Mathematical Software, vol.8, issue.1, pp.43-71, 1982.
DOI : 10.1145/355984.355989

D. S. Parker, Random butterfly transformations with applications in computational linear algebra, 1995.

B. Podvin, Y. Fraigneau, F. Lusseyran, and P. Gougat, A Reconstruction Method for the Flow Past an Open Cavity, Journal of Fluids Engineering, vol.128, issue.3, pp.531-540, 2006.
DOI : 10.1115/1.2175159

URL : https://hal.archives-ouvertes.fr/hal-00108761

G. Quintana-Ortí, E. S. Quintana-Ortí, R. A. van de Geijn, F. G. Van Zee, and E. Chan, Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, pp.1-26, 2009.
DOI : 10.1145/1527286.1527288

O. Schenk and K. Gärtner, On fast factorization pivoting methods for symmetric indefinite systems, Elec. Trans. Numer. Anal, vol.23, pp.158-179, 2006.

R. D. Skeel, Iterative refinement implies numerical stability for Gaussian elimination, Mathematics of Computation, vol.35, issue.151, pp.817-832, 1980.
DOI : 10.1090/S0025-5718-1980-0572859-4

D. C. Sorensen, Analysis of Pairwise Pivoting in Gaussian Elimination, IEEE Transactions on Computers, vol.34, issue.3, pp.274-278, 1985.
DOI : 10.1109/TC.1985.1676570

G. W. Stewart, Introduction to matrix computations, 1973.

P. E. Strazdins, A dense complex symmetric indefinite solver for the Fujitsu AP3000, 1999.

S. Tomov, J. Dongarra, and M. Baboulin, Towards dense linear algebra for hybrid GPU accelerated manycore systems, Parallel Computing, vol.36, issue.5-6, pp.232-240, 2010.
DOI : 10.1016/j.parco.2009.12.005

S. Tomov, R. Nath, and J. Dongarra, Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing, Parallel Computing, vol.36, issue.12, pp.645-654, 2010.
DOI : 10.1016/j.parco.2010.06.001

L. N. Trefethen and R. S. Schreiber, Average-Case Stability of Gaussian Elimination, SIAM Journal on Matrix Analysis and Applications, vol.11, issue.3, pp.335-360, 1990.
DOI : 10.1137/0611023

A. Turing, ROUNDING-OFF ERRORS IN MATRIX PROCESSES, The Quarterly Journal of Mechanics and Applied Mathematics, vol.1, issue.1, pp.287-308, 1948.
DOI : 10.1093/qjmam/1.1.287

V. Volkov and J. Demmel, Using GPUs to accelerate linear algebra routines, Poster at PAR lab winter retreat, 2008.

V. Volkov and J. W. Demmel, LU, QR and Cholesky factorizations using vector capabilities of GPUs, 2008.

P. Wedin, Perturbation theory for pseudo-inverses, BIT, vol.17, issue.2, pp.217-232, 1973.
DOI : 10.1007/BF01933494

J. H. Wilkinson, Rounding errors in algebraic processes, Her Majesty's Stationery Office, 1963.

I. Yamazaki, S. Tomov, and J. Dongarra, One-sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators, International Conference on Computational Science Procedia Computer Science, pp.2012-2049, 2012.
DOI : 10.1016/j.procs.2012.04.005

A. Yarkhan, J. Kurzak, and J. Dongarra, QUARK users guide: QUeueing And Runtime for Kernels, 2011.

T. J. Ypma, Historical Development of the Newton-Raphson Method, SIAM Review, vol.37, issue.4, pp.531-551, 1995.
DOI : 10.1137/1037125

M. Zelen, Linear estimation and related topics, Survey of numerical analysis, pp.558-584, 1962.