High Performance Parallel Algorithms for Tensor Decompositions

Oguz Kaya
Abstract : Tensor factorization has been increasingly used to analyze high-dimensional low-rank data of massive scale in numerous application domains, including recommender systems, graph analytics, health-care data analysis, signal processing, chemometrics, and many others. In these applications, efficient computation of tensor decompositions is crucial to be able to handle such datasets of high volume. The main focus of this thesis is on efficient decomposition of high-dimensional sparse tensors, with hundreds of millions to billions of nonzero entries, which arise in many emerging big data applications. We achieve this through three major approaches.

In the first approach, we provide distributed-memory parallel algorithms with an efficient point-to-point communication scheme for reducing the communication cost. These algorithms are agnostic to the partitioning of tensor elements and low-rank decomposition matrices, which allows us to investigate effective partitioning strategies for minimizing communication cost while establishing computational load balance. We use hypergraph-based techniques to analyze computational and communication requirements in these algorithms, and employ hypergraph partitioning tools to find suitable partitions that provide much better scalability.

Second, we investigate effective shared-memory parallelizations of these algorithms. Here, we carefully determine unit computational tasks and their dependencies, and express them using a proper data structure that exposes the parallelism underneath.

Third, we introduce a tree-based computational scheme that carries out expensive operations (involving the multiplication of the tensor with a set of vectors or matrices, found at the core of these algorithms) faster by factoring out and storing common partial results and effectively reusing them. With this computational scheme, we asymptotically reduce the number of tensor-vector and tensor-matrix multiplications for high-dimensional tensors, and thereby render computing tensor decompositions significantly cheaper for both sequential and parallel algorithms.

Finally, we diversify this main course of research with two extensions on similar themes. The first extension involves applying the tree-based computational framework to computing dense tensor decompositions, with an in-depth analysis of computational complexity and methods to find optimal tree structures minimizing the computational cost. The second work focuses on adapting effective communication and partitioning schemes of our parallel sparse tensor decomposition algorithms to the widely used non-negative matrix factorization problem, through which we obtain significantly better parallel scalability over the state-of-the-art implementations.

We point out that all theoretical results in the thesis are corroborated by parallel experiments on both shared-memory and distributed-memory platforms. With these fast algorithms as well as their tuned implementations for modern HPC architectures, we render tensor and matrix decomposition algorithms amenable to use for analyzing massive-scale datasets.
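The reuse idea behind the tree-based scheme mentioned in the abstract can be illustrated with a small sketch. It is not the thesis's implementation. The function names (`contract`, `ttv_naive`, `ttv_tree`), the dictionary-of-coordinates tensor format, the fixed two-level tree over a 4-mode tensor, and the sample data are all assumptions made for illustration. The sketch multiplies a sparse tensor by all vectors except one, for every mode, first independently and then by sharing two partial results.

```python
from collections import defaultdict

def contract(entries, modes, vectors, drop):
    """Multiply a sparse tensor by vectors[m] along every mode m in `drop`.

    entries: dict mapping index tuples to values (hypothetical COO-like format)
    modes:   tuple of mode labels, one per index position
    Returns the smaller sparse tensor and the labels of its remaining modes.
    """
    keep_pos = [p for p, m in enumerate(modes) if m not in drop]
    out = defaultdict(float)
    for idx, val in entries.items():
        for p, m in enumerate(modes):
            if m in drop:
                val *= vectors[m][idx[p]]
        out[tuple(idx[p] for p in keep_pos)] += val
    return dict(out), tuple(modes[p] for p in keep_pos)

def as_vector(entries):
    """Turn a one-mode sparse tensor into an {index: value} mapping."""
    return {idx[0]: val for idx, val in entries.items()}

def ttv_naive(entries, vectors):
    """For each mode n, contract all other modes starting from the full tensor."""
    modes = (0, 1, 2, 3)
    return [as_vector(contract(entries, modes, vectors, set(modes) - {n})[0])
            for n in modes]

def ttv_tree(entries, vectors):
    """Same results, but each half of the modes shares one stored partial result."""
    modes = (0, 1, 2, 3)
    results = [None] * 4
    for half, other in (({0, 1}, {2, 3}), ({2, 3}, {0, 1})):
        # Partial result: the tensor with the `other` modes already contracted.
        partial, partial_modes = contract(entries, modes, vectors, other)
        for n in half:
            # Reuse the partial result; only one more mode is contracted here.
            res, _ = contract(partial, partial_modes, vectors, half - {n})
            results[n] = as_vector(res)
    return results

# Made-up sample data: five nonzeros of a 2 x 2 x 2 x 2 tensor, one vector per mode.
entries = {(0, 0, 0, 0): 1.0, (0, 1, 1, 0): 2.0, (1, 1, 0, 1): 3.0,
           (1, 0, 1, 1): 4.0, (0, 1, 1, 1): 5.0}
vectors = [[1.0, 2.0], [3.0, 1.0], [2.0, 5.0], [1.0, 4.0]]
```

In this sketch the full tensor is traversed twice in `ttv_tree` instead of four times in `ttv_naive`; the remaining work operates on the smaller stored partial results.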
Cited literature [105 references]
Contributor : Abes Star
Submitted on : Wednesday, October 25, 2017 - 2:15:07 PM
Last modification on : Wednesday, November 20, 2019 - 3:27:44 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01623523, version 2



Oguz Kaya. High Performance Parallel Algorithms for Tensor Decompositions. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Lyon, 2017. English. ⟨NNT : 2017LYSEN051⟩. ⟨tel-01623523v2⟩


