Optimal Learning-Rate Schedules under Functional Scaling Laws: Power Decay and Warmup-Stable-Decay

arXiv:2602.06797v2. Abstract: We study optimal learning-rate schedules (LRSs) under the functional scaling law (FSL) framework introduced in Li et al. (2025), which accurately models the loss dynamics of both linear regression and large language mod...

🔗 Read more: https://arxiv.org/abs/2602.06797

#News #Policy #Software #Energy #AI #Academic