Optimal Learning-Rate Schedules under Functional Scaling Laws: Power Decay and Warmup-Stable-Decay
arXiv:2602.06797v2. Abstract: We study optimal learning-rate schedules (LRSs) under the functional scaling law (FSL) framework introduced in Li et al. (2025), which accurately models the loss dynamics of both linear regression and large language mod...
🔗 Read more: https://arxiv.org/abs/2602.06797
#News #Policy #Software #Energy #AI #Academic