New Research Alert: Making Language Models Smaller & Smarter!
Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.
Key Findings:
• 77% parameter reduction
• Maintained model capabilities
• Improved generalization
Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
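To make the idea concrete, here is a minimal sketch (not the authors' code, and the framework choice of PyTorch, the model width of 1024, and the group count of 4 are all assumptions) of how a grouped pointwise (1x1) convolution can stand in for a dense projection: each group only mixes its own slice of channels, so the weight count shrinks by roughly the number of groups.

```python
import torch.nn as nn

# Hypothetical illustration: replace a dense projection with a grouped
# pointwise (kernel_size=1) convolution. With `groups` > 1, each group mixes
# only its own channel slice, dividing the weight matrix by `groups`.
d_model, groups = 1024, 4  # assumed values for illustration only

dense = nn.Linear(d_model, d_model)                    # full dense projection
grouped = nn.Conv1d(d_model, d_model, kernel_size=1,   # pointwise convolution
                    groups=groups)                     # split channels into groups

dense_params = sum(p.numel() for p in dense.parameters())
grouped_params = sum(p.numel() for p in grouped.parameters())
print(f"dense: {dense_params:,}  grouped: {grouped_params:,}  "
      f"saved: {1 - grouped_params / dense_params:.0%}")
```

With these assumed settings the grouped layer keeps about a quarter of the dense layer's weights; the exact savings reported in the paper depend on its specific grouping scheme.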
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer Paper • 2501.15570 • Published Jan 26
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Paper • 2410.10812 • Published Oct 14, 2024
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024