Provable test-time adaptivity and distributional robustness of in-context learning
arXiv:2510.23254v2 Abstract: We study in-context learning problems where a Transformer is pretrained on tasks drawn from a mixture distribution $\pi=\sum_{\alpha\in\mathcal{A}} \lambda_{\alpha} \pi_{\alpha}$,...
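
To make the mixture notation concrete, the following is a minimal sketch of drawing one pretraining task from $\pi=\sum_{\alpha\in\mathcal{A}} \lambda_{\alpha} \pi_{\alpha}$: first pick a component index $\alpha$ with probability $\lambda_{\alpha}$, then draw the task from $\pi_{\alpha}$. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `sample_task`, the component labels `"a"` and `"b"`, the weights, and the choice of scalar Gaussian components standing in for the $\pi_{\alpha}$.

```python
import random


def sample_task(weights, component_samplers, rng):
    """Draw one task from pi = sum_alpha lambda_alpha * pi_alpha.

    weights: dict mapping component label alpha -> lambda_alpha (must sum to 1).
    component_samplers: dict mapping alpha -> function(rng) that samples from pi_alpha.
    Returns the chosen label together with the sampled task.
    """
    labels = list(weights)
    probs = [weights[a] for a in labels]
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    # Step 1: choose the mixture component alpha with probability lambda_alpha.
    alpha = rng.choices(labels, weights=probs, k=1)[0]
    # Step 2: sample the task from the chosen component pi_alpha.
    return alpha, component_samplers[alpha](rng)


# Hypothetical two-component mixture; each "task" is a single scalar parameter.
weights = {"a": 0.7, "b": 0.3}
component_samplers = {
    "a": lambda rng: rng.gauss(0.0, 0.1),  # assumed narrow component
    "b": lambda rng: rng.gauss(0.0, 2.0),  # assumed wide component
}

rng = random.Random(0)
draws = [sample_task(weights, component_samplers, rng) for _ in range(10000)]
frac_a = sum(1 for alpha, _ in draws if alpha == "a") / len(draws)
```

Over many draws, `frac_a` should be close to the weight assigned to component `"a"`, which is the sense in which $\lambda_{\alpha}$ is the prior probability of a task coming from $\pi_{\alpha}$.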