Technology, from arXiv cs.AI

When Context Returns: Toward Robust Internalization in On-Policy Distillation

Xun Wang, Ruishuo Chen, Zhuoran Li, Yu Chen, Longbo Huang
Thursday at 04:00

arXiv:2606.11627v1 (Announce Type: cross)

Abstract: Recent work has shown that on-policy distillation can internalize privileged context, such as system prompts or task hints, into a student model so that the context is no longer needed at inference time. Although this approach successfully improves the student's no-context performance, we identify...

Read the full article at the source.
