arXiv:2606.06486v1 Announce Type: cross Abstract: In this paper, we study regret minimization in repeated games with \emph{adaptive} opponents who can respond based on histories of play. The standard metric of \emph{external regret} in online learning is known to fail to capture such adaptivity. To account for players' counterfactual reasoning,...
Les hele artikkelen hos kilden.
Kommentarer (0)
Ingen kommentarer ennå. Bli den første til å kommentere!