Large Deviations of Gaussian Neural Networks with ReLU activation

arXiv:2405.16958v3 Announce Type: replace
Abstract: We prove a large deviation principle for deep neural networks with Gaussian weights and at most linearly growing activation functions, such as ReLU. This generalises earlier work, in which bounded and continuous activation functions were considered. In practice, linearly growing activation functions such as ReLU are the most commonly used. We furthermore simplify previous expressions for the rate function and provide a power-series expansion for the ReLU case.
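To fix ideas, the object of study is a deep network whose weights and biases are i.i.d. Gaussian and whose activation grows at most linearly (ReLU satisfies |max(0, x)| <= |x|). The following is a minimal illustrative sketch of such a network; the layer widths, the 1/sqrt(n) variance scaling, and the function names are assumptions for illustration only, not the paper's construction.

    import numpy as np

    def relu(x):
        # ReLU grows at most linearly: |relu(x)| <= |x|
        return np.maximum(0.0, x)

    def gaussian_relu_network(x, widths, rng, sigma_w=1.0, sigma_b=1.0):
        # Forward pass with i.i.d. Gaussian weights and biases.
        # Weight variance is scaled by 1/n_in (an assumed, standard choice).
        h = np.asarray(x, dtype=float)
        for n_in, n_out in zip(widths[:-1], widths[1:]):
            W = rng.normal(0.0, sigma_w / np.sqrt(n_in), size=(n_out, n_in))
            b = rng.normal(0.0, sigma_b, size=n_out)
            h = relu(W @ h + b)
        return h

    rng = np.random.default_rng(0)
    out = gaussian_relu_network(np.ones(3), widths=[3, 64, 64, 1], rng=rng)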
