arXiv:2606.05326v1 Announce Type: cross Abstract: We study the dynamics of gradient descent in the Edge of Stability regime, where the learning rate is large enough to induce persistent oscillations in the loss and the sharpness. We propose a continuous-time effective model that tracks the evolution of the average trajectory coupled with the...
Læs hele artiklen hos kilden.
Kommentarer (0)
Ingen kommentarer ennå. Bli den første til å kommentere!