A Unified Matrix-Spectral Framework for Stability and Interpretability in Deep Learning

We develop a unified matrix-spectral framework for analyzing stability and interpretability in deep neural networks. Representing networks as data-dependent products of linear operators reveals spectral quantities governing sensitivity to input perturbations, label noise, and training dynamics.
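The operator-product view is easy to probe numerically. Below is a minimal PyTorch sketch (the network, layer sizes, and input are hypothetical, not the paper's experimental setup): at a fixed input, a ReLU network acts as a product of weight matrices masked by the active units, so the input-output Jacobian at that point carries the local spectral information.

```python
import torch

# Hypothetical toy MLP; any differentiable model works the same way.
net = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 3),
)
x = torch.randn(10)

# Input-output Jacobian at x; for this ReLU net it equals the product of
# the weight matrices restricted to the currently active units.
J = torch.autograd.functional.jacobian(net, x)  # shape (3, 10)

# Singular values govern first-order forward sensitivity:
# ||f(x + dx) - f(x)|| <= s.max() * ||dx|| + o(||dx||).
s = torch.linalg.svdvals(J)
print("local spectral norm:", s.max().item())
```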
We introduce a Global Matrix Stability Index that aggregates spectral information from Jacobians, parameter gradients, Neural Tangent Kernel (NTK) operators, and loss Hessians into a single stability scale controlling forward sensitivity, attribution robustness, and optimization conditioning. We further show that spectral entropy refines classical operator-norm bounds by capturing typical, rather than purely worst-case, sensitivity.
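The abstract does not give the exact definitions, so the sketch below shows one natural reading as an assumption: spectral entropy as the Shannon entropy of the normalized singular-value spectrum, and the aggregate index as a geometric mean of the spectral norms of the constituent operators. Both formulas are illustrative, not the paper's stated construction.

```python
import numpy as np

def spectral_entropy(s, eps=1e-12):
    """Shannon entropy of the normalized singular-value spectrum.

    Low entropy: energy concentrated in a few directions, so typical
    perturbations behave near the operator-norm worst case. High entropy:
    a flat spectrum, so typical sensitivity sits well below that bound.
    """
    p = np.asarray(s, dtype=float)
    p = p / (p.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def stability_index(spectra):
    """Hypothetical aggregate in the spirit of a global stability index:
    geometric mean of the spectral norms of the constituent operators
    (Jacobian, parameter gradient, NTK, loss Hessian). The paper's actual
    aggregation rule may differ.
    """
    norms = np.array([np.max(s) for s in spectra])
    return float(np.exp(np.log(norms + 1e-12).mean()))

# Usage: pass one singular-value array per operator, e.g.
#   stability_index([jac_svals, grad_svals, ntk_eigs, hess_eigs])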
These quantities yield computable diagnostics and stability-oriented regularization principles. Synthetic experiments and controlled studies on MNIST, CIFAR-10, and CIFAR-100 confirm that modest spectral regularization substantially improves attribution stability even when global spectral summaries change little.
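One simple instance of "modest spectral regularization" is sketched below, under the assumption that the penalty targets the largest singular value of each weight matrix; the coefficient and the per-matrix choice are illustrative defaults, not the paper's exact recipe.

```python
import torch

def spectral_penalty(model, coef=1e-3):
    # Penalize the top singular value of every 2-D weight matrix.
    # coef is an illustrative setting; a "modest" penalty in the
    # abstract's sense would be tuned per task.
    return coef * sum(
        torch.linalg.matrix_norm(p, ord=2)  # largest singular value
        for p in model.parameters() if p.ndim == 2
    )

# In a training step, add the penalty to the task loss:
#   loss = criterion(model(x), y) + spectral_penalty(model)
#   loss.backward()
```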
The results establish a precise connection between spectral concentration and analytic stability, providing practical guidance for robustness-aware model design and training.
