Bayesian R-LayerNorm: Uncertainty-Aware Adaptive Normalization with Provable Robustness Bounds

This paper introduces Bayesian R-LayerNorm, a novel normalization layer that extends the previously proposed R-LayerNorm with formal mathematical foundations and uncertainty quantification. Building upon the empirical success of R-LayerNorm, we present a complete mathematical formalism using statistical field theory, renormalization group methods, and information geometry. Our approach provides provable stability guarantees through three theorems: numerical stability, gradient stability, and training convergence. The Bayesian extension incorporates uncertainty estimation through a stable ψ-function, enabling adaptive noise suppression based on local entropy estimates. A key contribution is the integration of uncertainty quantification directly into the normalization operation, providing confidence estimates for each normalized activation without additional cost. The method is adaptive to local noise, varying its normalization strength spatially based on estimated noise levels. Despite its theoretical depth, the implementation is simple and serves as a drop-in replacement for existing normalization layers, adding only two learnable parameters per layer. Experimental validation on the full CIFAR-10-C dataset demonstrates consistent improvements: Bayesian R-LayerNorm achieves average accuracy gains of +0.49% over standard LayerNorm across four common corruptions, with the largest improvement of +0.74% on shot noise. The method requires minimal computational overhead (∼10%), and we provide a complete open-source implementation. We further show that the learned λ parameters offer interpretability, revealing which layers adapt most strongly to different corruptions. While the accuracy gains are modest, the framework opens new directions for trustworthy and interpretable normalization in safety-critical applications where uncertainty matters as much as accuracy.
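To make the "drop-in replacement with two learnable parameters" claim concrete, the following is a minimal NumPy sketch of what such a layer could look like. The exact ψ-function is not specified in the abstract, so the sigmoid-of-log-variance gate, the parameter names `lam` and `beta`, and the blending scheme below are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def layernorm(x, eps=1e-5):
    # Standard LayerNorm over the last axis.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def bayesian_r_layernorm(x, lam=1.0, beta=0.5, eps=1e-5):
    """Hypothetical sketch of an uncertainty-gated LayerNorm.

    `lam` and `beta` stand in for the two learnable scalars the
    abstract mentions. The psi gate below (a sigmoid of the local
    log-variance) is an assumed smooth form, chosen only to
    illustrate the idea of confidence-based noise suppression.
    """
    var = x.var(axis=-1, keepdims=True)
    # psi maps local variance to a confidence in (0, 1):
    # higher local variance -> lower confidence -> stronger damping.
    psi = 1.0 / (1.0 + np.exp(lam * np.log(var + eps)))
    y = layernorm(x, eps)
    # Blend the plain normalized output with a psi-scaled copy;
    # beta controls how strongly uncertainty modulates activations.
    out = (1.0 - beta) * y + beta * psi * y
    return out, psi
```

The per-activation `psi` values returned alongside the output correspond to the confidence estimates the abstract describes as coming "without additional cost": they reuse the variance statistics LayerNorm already computes.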
