Skip to Main content Skip to Navigation

A distributed Frank-Wolfe framework for trace norm minimization via the bulk synchronous parallel model

Abstract : Learning low-rank matrices is a problem of great importance in statistics, machine learning, computer vision, recommender systems, etc. Because of its NP-hard nature, a principled approach is to solve its tightest convex relaxation : trace norm minimization. Among various algorithms capable of solving this optimization is the Frank-Wolfe method, which is particularly suitable for high-dimensional matrices. In preparation for the usage of distributed infrastructures to further accelerate the computation, this study aims at exploring the possibility of executing the Frank-Wolfe algorithm in a star network with the Bulk Synchronous Parallel (BSP) model and investigating its efficiency both theoretically and empirically. In the theoretical aspect, this study revisits Frank-Wolfe's fundamental deterministic sublinear convergence rate and extends it to nondeterministic cases. In particular, it shows that with the linear subproblem appropriately solved, Frank-Wolfe can achieve a sublinear convergence rate both in expectation and with high probability. This contribution lays the theoretical foundation of using power iteration or Lanczos iteration to solve the linear subproblem for trace norm minimization. In the algorithmic aspect, within the BSP model, this study proposes and analyzes four strategies for the linear subproblem as well as methods for the line search. Moreover, noticing Frank-Wolfe's rank-1 update property, it updates the gradient recursively, with either a dense or a low-rank representation, instead of repeatedly recalculating it from scratch. All of these designs are generic and apply to any distributed infrastructures compatible with the BSP model. In the empirical aspect, this study tests the proposed algorithmic designs in an Apache SPARK cluster. According to the experiment results, for the linear subproblem, centralizing the gradient or averaging the singular vectors is sufficient in the low-dimensional case, whereas distributed power iteration, with as few as one or two iterations per epoch, excels in the high-dimensional case. The Python package developed for the experiments is modular, extensible and ready to deploy in an industrial context. This study has achieved its function as proof of concept. Following the path it sets up, solvers can be implemented for various infrastructures, among which GPU clusters, to solve practical problems in specific contexts. Besides, its excellent performance in the ImageNet dataset makes it promising for deep learning.
Complete list of metadatas

Cited literature [153 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Monday, May 20, 2019 - 12:59:06 PM
Last modification on : Friday, October 23, 2020 - 5:04:20 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02134166, version 1


Wenjie Zheng. A distributed Frank-Wolfe framework for trace norm minimization via the bulk synchronous parallel model. Distributed, Parallel, and Cluster Computing [cs.DC]. Sorbonne Université, 2018. English. ⟨NNT : 2018SORUS049⟩. ⟨tel-02134166⟩



Record views


Files downloads