Technology, from arXiv cs.AI

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

Antonin Chodron de Courcel
Jun 5, 2026 at 04:00

arXiv:2606.05326v1 (announce type: cross)

Abstract: We study the dynamics of gradient descent in the Edge of Stability regime, where the learning rate is large enough to induce persistent oscillations in the loss and the sharpness. We propose a continuous-time effective model that tracks the evolution of the average trajectory coupled with the...
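For readers new to the term, the sketch below shows the textbook stability threshold behind the phrase "learning rate large enough to induce persistent oscillations". It is not the paper's free energy or kinetic model, which the excerpt does not describe. It assumes a one-dimensional quadratic loss f(x) = 0.5 * sharpness * x**2, where "sharpness" stands in for the curvature (in the Edge of Stability literature, typically the largest Hessian eigenvalue). The function name `gd_on_quadratic` and all parameter values are hypothetical, chosen only for illustration. On this loss each gradient step multiplies x by (1 - lr * sharpness), so the iterates shrink when sharpness < 2 / lr, flip sign with constant size at sharpness = 2 / lr, and grow beyond it.

```python
def gd_on_quadratic(sharpness, lr, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2.

    Returns the list of iterates, starting with x0.
    """
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = sharpness * x   # derivative of 0.5 * sharpness * x**2
        x = x - lr * grad      # plain gradient step
        xs.append(x)
    return xs


lr = 0.1
threshold = 2.0 / lr  # curvature at which the step factor (1 - lr * sharpness) equals -1

below = gd_on_quadratic(0.9 * threshold, lr)  # factor -0.8: oscillates and decays
at = gd_on_quadratic(threshold, lr)           # factor -1.0: oscillates with constant size
above = gd_on_quadratic(1.1 * threshold, lr)  # factor -1.2: oscillates and grows

for name, xs in [("below", below), ("at", at), ("above", above)]:
    print(f"{name:>5}: |x_final| = {abs(xs[-1]):.3e}")
```

In a real network the loss is not quadratic and the curvature itself changes during training, which is why the regime in the abstract involves oscillations of both the loss and the sharpness rather than simple divergence.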

Read the full article at the source.
