Bi-Level Prompt Optimization for Multimodal LLM-as-a-Judge
arXiv:2602.11340v1. Announce Type: new.
Abstract: Large language models (LLMs) have become widely adopted as automated judges for evaluating AI-generated content. Despite their success, aligning LLM-based...