Why Your Deep Research Agent Fails? On Hallucination Evaluation in Full Research Trajectory
arXiv:2601.22984v1
Abstract: Diagnosing the failure mechanisms of Deep Research Agents (DRAs) remains a critical challenge. Existing benchmarks predominantly rely on end-to-end evaluation,...