Distribution-Calibrated Inference Time Compute for Thinking LLM-as-a-Judge

Hamid Dadkhahi, Firas Trabelsi, Parker Riley, Juraj Juraska, Mehdi Mirzazadeh

Jun 3, 2026 at 04:00

arXiv:2512.03019v2 (announce type: replace-cross)

Abstract: Thinking Large Language Models (LLMs) used as judges for pairwise preferences remain noisy at the single-sample level, and common aggregation rules (majority vote, soft self-consistency, or instruction-based self-aggregation) are inconsistent when ties are allowed. We study inference-time...
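
For orientation, here is a minimal sketch of the plain majority-vote baseline named in the abstract, applied to several sampled pairwise judgments where a tie is a permitted label. It is not the paper's method. The function name `majority_vote`, the label strings `"A"`, `"B"`, and `"tie"`, and the rule of returning `"tie"` when the top count is shared are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Aggregate sampled pairwise judgments by plain majority vote.

    `votes` is a non-empty list of labels, each "A", "B", or "tie"
    (hypothetical label names). If several labels share the highest
    count, this sketch falls back to "tie" (an assumed convention).
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    counts = Counter(votes)
    top_count = max(counts.values())
    # Collect every label that reaches the highest count.
    winners = [label for label, count in counts.items() if count == top_count]
    return winners[0] if len(winners) == 1 else "tie"


if __name__ == "__main__":
    samples = ["A", "tie", "A", "B", "tie", "A"]
    print(majority_vote(samples))
```

With three possible labels, the winning label can hold well under half of the samples, which is one reason the handling of ties matters for any aggregation rule of this kind.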
