- Ultra-Sparse Memory Network
  Paper • 2411.12364 • Published • 24
- Hyper-Connections
  Paper • 2409.19606 • Published • 23
- Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
  Paper • 2411.03884 • Published • 29
- Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
  Paper • 2501.16975 • Published • 31
Open-Foundation-Models
non-profit
AI & ML interests
None defined yet.
Collections: 1
Models: 2
Datasets: 0 (none public yet)