Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

Gleb Gerasimov, Timofei Rusalev, Nikita Balagansky, Daniil Laptev, Vadim Kurochkin, Daniil Gavrilov

Thursday at 04:00


arXiv:2606.12138v1 (announce type: cross)

Abstract: Sparse autoencoders (SAEs) are widely used to interpret neural network representations, but their utility depends on whether the learned features are reproducible across training runs. We study this question through feature stability: for each SAE feature, we estimate the probability that a...
