Fast Catch-Up, Late Switching: Optimal Batch Size Scheduling via Functional Scaling Laws
arXiv:2602.14208v2 (announce type: replace-cross)
Abstract: Batch size scheduling (BSS) plays a critical role in large-scale deep learning training, influencing both optimization dynamics and computational efficiency. Yet, its theoretical foundations remain poorly understood. In...
🔗 Read more: https://arxiv.org/abs/2602.14208
#News #Policy #AI #Academic