A Computational Transition for Detecting Multivariate Shuffled Linear Regression by Low-Degree Polynomials

arXiv:2504.03097v2 Announce Type: replace
Abstract: In this paper, we study the problem of multivariate shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we investigate the model $Y=\tfrac{1}{\sqrt{1+\sigma^2}}(\Pi_* X Q_* + \sigma Z)$, where $X$ is an $n \times d$ standard Gaussian design matrix, $Z$ is an $n \times m$ Gaussian noise matrix, $\Pi_*$ is an unknown $n \times n$ permutation matrix, and $Q_*$ is an unknown $d \times m$ matrix on the Grassmannian manifold satisfying $Q_*^{\top} Q_* = \mathbb{I}_m$.
Consider the hypothesis testing problem of distinguishing this model from the case where $X$ and $Y$ are independent Gaussian random matrices of sizes $n \times d$ and $n \times m$, respectively. Our results reveal a phase transition in the performance of low-degree polynomial algorithms for this task. (1) When $m=o(d)$, we show that all degree-$D$ polynomials fail to distinguish these two models even when $\sigma=0$, provided that $D^4=o\big( \tfrac{d}{m} \big)$. (2) When $m=d$ and $\sigma=\omega(1)$, we show that all degree-$D$ polynomials fail to distinguish these two models provided that $D=o(\sigma)$. (3) When $m=d$ and $\sigma=o(1)$, we show that there exists a constant-degree polynomial that strongly distinguishes these two models. These results establish a smooth transition in the effectiveness of low-degree polynomial algorithms for this problem, highlighting the interplay between the dimensions $m$ and $d$, the noise level $\sigma$, and the computational complexity of the testing task.
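To make the two hypotheses concrete, the following is a minimal NumPy sketch of the sampling setup described above. The abstract does not specify how $\Pi_*$ and $Q_*$ are generated, so the sketch assumes a uniformly random permutation and a $Q_*$ obtained from the QR factorization of a Gaussian matrix (which satisfies $Q_*^{\top} Q_* = \mathbb{I}_m$ when $m \le d$); all function names are illustrative and not taken from the paper.

```python
# Hedged simulation sketch of the planted and null models from the abstract.
# Assumptions (not specified in the abstract): Pi_* is a uniformly random
# permutation; Q_* comes from the QR factorization of a d x m Gaussian matrix.
import numpy as np

def sample_planted(n, d, m, sigma, rng):
    """Sample (X, Y) with Y = (Pi_* X Q_* + sigma * Z) / sqrt(1 + sigma^2)."""
    X = rng.standard_normal((n, d))
    Z = rng.standard_normal((n, m))
    perm = rng.permutation(n)                         # rows permuted by Pi_*
    Q, _ = np.linalg.qr(rng.standard_normal((d, m)))  # d x m, Q^T Q = I_m (requires m <= d)
    Y = (X[perm] @ Q + sigma * Z) / np.sqrt(1.0 + sigma**2)
    return X, Y

def sample_null(n, d, m, rng):
    """Sample (X, Y) as independent Gaussian matrices of sizes n x d and n x m."""
    return rng.standard_normal((n, d)), rng.standard_normal((n, m))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1, Y1 = sample_planted(n=200, d=50, m=10, sigma=0.5, rng=rng)
    X0, Y0 = sample_null(n=200, d=50, m=10, rng=rng)
    # A degree-D polynomial test is a polynomial in the entries of (X, Y)
    # whose typical values differ between the two samples generated above.
    print(X1.shape, Y1.shape, X0.shape, Y0.shape)
```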
