arXiv:2511.19314v2 Announce Type: replace Abstract: Information-seeking is a core capability for AI agents, requiring them to gather and reason over tool-generated information across long trajectories. However, such multi-step information-seeking tasks remain challenging for agents backed by language models. While process reward models (PRMs) can...
Read the full article at the source.
Comments (0)
No comments yet. Be the first to comment!