New Research Alert: Making Language Models Smaller & Smarter!
Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.
Key Findings:
• 77% parameter reduction
• Maintained model capabilities
• Improved generalization
Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
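To make the idea concrete, here is a minimal sketch (not the authors' code, and the framework choice of PyTorch, the model width of 1024, and the group count of 4 are all assumptions) of how a grouped pointwise (1x1) convolution can stand in for a dense projection: each group only mixes its own slice of channels, so the weight count shrinks by roughly the number of groups.

```python
import torch.nn as nn

# Hypothetical illustration: replace a dense projection with a grouped
# pointwise (kernel_size=1) convolution. With `groups` > 1, each group mixes
# only its own channel slice, dividing the weight matrix by `groups`.
d_model, groups = 1024, 4  # assumed values for illustration only

dense = nn.Linear(d_model, d_model)                    # full dense projection
grouped = nn.Conv1d(d_model, d_model, kernel_size=1,   # pointwise convolution
                    groups=groups)                     # split channels into groups

dense_params = sum(p.numel() for p in dense.parameters())
grouped_params = sum(p.numel() for p in grouped.parameters())
print(f"dense: {dense_params:,}  grouped: {grouped_params:,}  "
      f"saved: {1 - grouped_params / dense_params:.0%}")
```

With these assumed settings the grouped layer keeps about a quarter of the dense layer's weights; the exact savings reported in the paper depend on its specific grouping scheme.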
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published Jan 26
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Paper • 2410.10812 • Published Oct 14, 2024
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024