arXiv:2606.02765v1 Announce Type: cross Abstract: Model dimension ($d_{model}$) is a fundamental hyperparameter in transformer language models, yet its role in setting the geometric limits of feature representation remains under-explored. Grounded in the Linear Representation and Superposition Hypotheses - which propose that models encode...
Les hele artikkelen hos kilden.
Kommentarer (0)
Ingen kommentarer ennå. Bli den første til å kommentere!