Technology, from arXiv cs.AI

Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges

Srimonti Dutta, Akshata Kishore Moharir
Jun 5, 2026 at 04:00

arXiv:2606.05384v1 (announce type: new)

Abstract: LLM-as-judge evaluation is widely used in benchmarking pipelines, where model outputs are compared and ranked using automated evaluators. These pipelines typically assume that judgments are stable properties of fixed inputs. We show that this assumption does not hold under interaction. We study...

Read the full article at the source.
