Tien Dung
tiendung
13 followers · 114 following
AI & ML interests
None yet
Recent Activity
liked a model 17 days ago: SparseLLM/BlockFFN-3B-SFT
liked a model 25 days ago: turboderp/ERNIE-4.5-300B-A47B-PT-exl3
reacted to Jaward's post with 😎 29 days ago:
I played around with the new RXTX paper (XX^T) and was able to train nanogpt with 4x4 RXTX matmuls in both the attention layer and the optimizer 🤕 It just works (well, I had to add some guardrails) but still saves 5% of memory usage.
The patch:
- Computes attention scores with 4x4 blockwise RXTX matmuls (no PyTorch dot product).
- Handles arbitrary sequence lengths by padding to the nearest multiple of 4.
- Uses an RXTX variant of Shampoo, with params reshaped into 4x4 blocks during each optimizer step.
- Uses 5% fewer ops.
Code: https://github.com/Jaykef/ai-algorithms/blob/main/nanogpt-rxtx.ipynb
Paper: https://arxiv.org/pdf/2505.09814
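As a rough illustration of two ideas in the post (padding to a multiple of 4 and computing XX^T block by block), here is a minimal pure-Python sketch. It is not the RXTX algorithm from the paper and not the code in the linked notebook: it only exploits the symmetry of XX^T by computing the blocks on or above the diagonal of a 4x4 block grid and mirroring the rest. The function names `pad_rows`, `dot`, and `gram_blockwise` are hypothetical.

```python
def pad_rows(x, multiple=4):
    """Zero-pad a list of row vectors so the row count is a multiple of `multiple`."""
    width = len(x[0]) if x else 0
    pad = (-len(x)) % multiple  # rows needed to reach the next multiple
    return [list(row) for row in x] + [[0.0] * width for _ in range(pad)]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_blockwise(x, grid=4):
    """Compute X @ X^T over a grid x grid block layout (assumed scheme, not RXTX).

    Only the blocks on or above the block diagonal are computed; each entry
    is mirrored into the lower triangle because X @ X^T is symmetric.
    Returns the result trimmed to the original size and the number of
    block products that were evaluated.
    """
    n_orig = len(x)
    xp = pad_rows(x, grid)
    n = len(xp)
    b = n // grid  # rows per block
    out = [[0.0] * n for _ in range(n)]
    block_products = 0
    for bi in range(grid):
        for bj in range(bi, grid):  # skip blocks below the diagonal
            block_products += 1
            for i in range(bi * b, (bi + 1) * b):
                for j in range(bj * b, (bj + 1) * b):
                    val = dot(xp[i], xp[j])
                    out[i][j] = val
                    out[j][i] = val  # mirror by symmetry
    # Drop the rows and columns that came from zero padding.
    trimmed = [row[:n_orig] for row in out[:n_orig]]
    return trimmed, block_products
```

With a 4x4 block grid this evaluates 10 of the 16 block products. The savings reported in the post come from the RXTX scheme itself, which this sketch does not reproduce.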
Organizations
tiendung's datasets: 3
tiendung/cc-vi_truyen-filters • Preview • Updated Oct 3, 2023 • 4
tiendung/cc-vi_domains • Preview • Updated Sep 21, 2023 • 4
tiendung/chai • Viewer • Updated Sep 15, 2023 • 70.8k • 9