Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs
arXiv:2602.07729v2 Announce Type: replace
Abstract: Reinforcement learning (RL), particularly RL from verifiable reward (RLVR), has become a crucial phase of training large language models (LLMs)...