An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function

arXiv:2503.23832v2 Announce Type: replace-cross
Abstract: ReLU matrix decomposition (RMD) is the following problem: given a sparse, nonnegative matrix $X$ and a factorization rank $r$, identify a rank-$r$ matrix $\Theta$ such that $X \approx \max(0,\Theta)$. RMD is a particular instance of nonlinear matrix decomposition (NMD) that finds applications in data compression, matrix completion with entries missing not at random, and manifold learning. The standard RMD model minimizes the least squares error, that is, $\|X - \max(0,\Theta)\|_F^2$. The corresponding optimization problem, Least-Squares RMD (LS-RMD), is nondifferentiable and highly nonconvex. This motivated Saul to propose an alternative model, dubbed Latent-RMD, in which a latent variable $Z$ is introduced that satisfies $\max(0,Z)=X$ while $\|Z - \Theta\|_F^2$ is minimized ("A nonlinear matrix decomposition for mining the zeros of sparse data", SIAM J. Math. Data Sci., 2022). Our first contribution is to show that the two formulations may yield different low-rank solutions $\Theta$. We then consider a reparametrization of Latent-RMD, called 3B-RMD, in which $\Theta$ is substituted by a low-rank product $WH$, where $W$ has $r$ columns and $H$ has $r$ rows. Our second contribution is to prove the convergence of a block coordinate descent (BCD) approach applied to 3B-RMD. Our third contribution is a novel extrapolated variant of BCD, dubbed eBCD, which we prove is also convergent under mild assumptions. We illustrate the significant acceleration effect of eBCD compared to BCD, and also show that eBCD performs well against the state of the art on synthetic and real-world data sets.
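To make the 3B-RMD formulation concrete, here is a minimal Python sketch of a BCD-style alternation over the three blocks $(Z, W, H)$, with an optional fixed-weight extrapolation step in the spirit of eBCD. The update order, the fixed extrapolation weight `beta`, and the function name `bcd_3b_rmd` are illustrative assumptions, not the authors' exact algorithm; in particular, the paper's eBCD uses a provably convergent extrapolation scheme rather than a constant weight.

```python
import numpy as np

def bcd_3b_rmd(X, r, n_iter=500, beta=0.0, seed=0):
    """Alternating updates for min_{Z,W,H} ||Z - WH||_F^2
    subject to max(0, Z) = X, i.e., Z = X on the support of X
    and Z <= 0 where X is zero. beta = 0 gives plain BCD;
    beta > 0 is a heuristic stand-in for the paper's eBCD schedule.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    pos = X > 0                       # support of X: Z is pinned to X there
    W = rng.standard_normal((m, r))
    H = rng.standard_normal((r, n))
    Z = X.astype(float).copy()
    W_old, H_old = W.copy(), H.copy()
    for _ in range(n_iter):
        # Extrapolated point (reduces to the current iterate when beta = 0).
        We = W + beta * (W - W_old)
        He = H + beta * (H - H_old)
        Theta = We @ He
        # Z-update: closed-form projection onto {Z : max(0, Z) = X}.
        Z[pos] = X[pos]
        Z[~pos] = np.minimum(0.0, Theta[~pos])
        W_old, H_old = W.copy(), H.copy()
        # W- and H-updates: unconstrained linear least squares blocks.
        W = np.linalg.lstsq(He.T, Z.T, rcond=None)[0].T
        H = np.linalg.lstsq(W, Z, rcond=None)[0]
    return W, H
```

With `beta = 0` this is the plain BCD iteration; a small positive `beta` mimics the acceleration effect the abstract reports for eBCD, at the cost of the convergence guarantee that the paper establishes for its specific extrapolation rule.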
