Spectral Asymptotics of Neural Network Loss Landscapes: An Exact Decomposition of the Curvature Exponent
Researchers introduce the Spectral Alignment Decomposition to explain how Hessian eigenvalue scaling varies across neural network layer types.
The study defines the curvature exponent alpha, which governs how Hessian eigenvalues scale with gradient singular values. By proving the Spectral Alignment Decomposition, the authors show that alpha is determined by the alignment between a layer's activations and its gradients, which explains why its value differs across convolutions, attention mechanisms, and MLP projections.
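
To make the idea of a scaling exponent concrete, the following is a minimal sketch that estimates an exponent from paired spectra. It assumes a power-law form, lambda_i ≈ c * sigma_i^alpha, which is an illustrative assumption and not the definition given in the study. The function name `fit_exponent` and the synthetic data are hypothetical; this is not the authors' estimator and does not implement the Spectral Alignment Decomposition itself.

```python
import numpy as np


def fit_exponent(sigma, lam):
    """Estimate alpha as the least-squares slope of log(lam) against log(sigma).

    Assumes (for illustration only) lam ~ c * sigma**alpha.
    """
    sigma = np.asarray(sigma, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # Logs are only defined for strictly positive values.
    mask = (sigma > 0) & (lam > 0)
    slope, _intercept = np.polyfit(np.log(sigma[mask]), np.log(lam[mask]), 1)
    return slope


# Synthetic spectra standing in for one layer's measurements (hypothetical).
rng = np.random.default_rng(0)
true_alpha = 1.5
sigma = np.logspace(-2, 0, 50)                 # stand-in gradient singular values
noise = np.exp(0.01 * rng.standard_normal(50))  # small multiplicative noise
lam = 3.0 * sigma**true_alpha * noise           # stand-in Hessian eigenvalues

alpha_hat = fit_exponent(sigma, lam)
print(f"estimated alpha: {alpha_hat:.3f}")
```

Under this assumption, running the same fit separately on the spectra of each layer type would give one exponent per layer, which is the kind of per-layer variation the decomposition is said to explain.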