Sasikanth's Collections
Training Tricks
Updated Apr 5
ZClip: Adaptive Spike Mitigation for LLM Pre-Training
Paper • 2504.02507 • Published Apr 3 • 78
Variance Control via Weight Rescaling in LLM Pre-training
Paper • 2503.17500 • Published Mar 21 • 5