SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning
arXiv:2506.09016v3
Abstract: Training large language models with reinforcement learning (RL) against verifiable rewards significantly enhances their reasoning abilities, yet remains computationally expensive...