Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture — Paper 2412.11834, published Dec 16, 2024
Post: Pre-training a model on only a single RTX 4090 is really slow, even for small language models! (https://huggingface.co/collections/JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)
Post: Warmup -> stable -> decay learning rate scheduler: use the stable-phase checkpoints to continue training the model on any new dataset without training spikes! SmallDoge/Doge-20M-checkpoint SmallDoge/Doge-60M-checkpoint
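The idea above — a schedule that warms up, holds a constant learning rate (where checkpoints are saved for later continuation), then decays — can be sketched as a simple step-to-LR function. This is a minimal illustration; the phase lengths, peak/minimum learning rates, and cosine decay shape are assumptions, not the exact Doge training configuration:

```python
import math

def wsd_lr(step, max_lr=1e-3, warmup_steps=1000, stable_steps=8000,
           decay_steps=1000, min_lr=1e-4):
    """Warmup -> stable -> decay (WSD) learning rate schedule.

    Checkpoints saved during the stable phase can seed continued
    training on a new dataset: resuming there re-enters the schedule
    at the constant-LR plateau, avoiding a loss spike from a sudden
    learning rate jump.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to max_lr.
        return max_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable phase: constant learning rate.
        return max_lr
    # Decay phase: cosine anneal from max_lr down to min_lr.
    progress = min(1.0, (step - warmup_steps - stable_steps) / decay_steps)
    return min_lr + (max_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))
```

To resume on a new dataset, one would load a stable-phase checkpoint and keep calling `wsd_lr` with step counts inside the stable window, deferring the decay phase until the final dataset mix.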