Technology, from arXiv cs.AI

Gradient descent at the Edge of Stability: free energy model and kinetic description of the two-layer network

Antonin Chodron de Courcel
Jun 5, 2026 at 04:00

arXiv:2606.05326v1 Announce Type: cross Abstract: We study the dynamics of gradient descent in the Edge of Stability regime, where the learning rate is large enough to induce persistent oscillations in the loss and the sharpness. We propose a continuous-time effective model that tracks the evolution of the average trajectory coupled with the...

Read the full article at the source.
