Peter Kis

NePe

NePe's activity

updated a model about 15 hours ago

NePe/Qwen3-30B-A3B-GPTQ

Text Generation • Updated about 15 hours ago • 1

published a model about 15 hours ago

NePe/Qwen3-30B-A3B-GPTQ

Text Generation • Updated about 15 hours ago • 1

New activity in JunHowie/Qwen3-30B-A3B-GPTQ-Int4 1 day ago

Slow GPTQ inference

#2 opened 2 days ago by NePe

New activity in AlphaGaO/Qwen3-30B-A3B-GPTQ 1 day ago

Slow GPTQ inference

#1 opened 2 days ago by NePe

liked 2 models 2 days ago

AlphaGaO/Qwen3-32B-GPTQ

Text Generation • Updated 2 days ago • 30 • 1

AlphaGaO/Qwen3-30B-A3B-GPTQ

Text Generation • Updated 3 days ago • 1.21k • 3

New activity in moonshotai/Moonlight-16B-A3B-Instruct 9 days ago

PEFT finetuning support

#14 opened 9 days ago by NePe

liked a model 15 days ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • Updated 1 day ago • 42.6k • 916

liked a model about 1 month ago

rasbt/llama-3.2-from-scratch

Updated 16 days ago • 269

New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 about 1 month ago

Can't determine properly which is greater between 9.9 and 9.11

#38 opened about 1 month ago by sniffski

liked a model about 1 month ago

mistralai/Mistral-Small-3.1-24B-Instruct-2503

Image-Text-to-Text • Updated 24 days ago • 84.1k • 1.19k

liked a model 7 months ago

mistralai/Mistral-Small-Instruct-2409

Updated Oct 16, 2024 • 6.96k • 384

upvoted 2 papers 9 months ago

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Paper • 2404.05892 • Published Apr 8, 2024 • 39

GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression

Paper • 2407.12077 • Published Jul 16, 2024 • 57

New activity in google/gemma-2-27b-it 10 months ago

with load_in_4bit it just generates <pad> tokens

#16 opened 10 months ago by NePe

New activity in rainjay/gemma-2-27b-it-4bit 10 months ago

it just keeps generating <pad> tokens

#1 opened 10 months ago by NePe

liked a model 10 months ago

rainjay/gemma-2-27b-it-4bit

Text Generation • Updated Jun 28, 2024 • 2 • 3

reacted to santiviquez's post with 🔥 11 months ago

I ran 580 experiments (yes, 580 🤯) to check if we can quantify data drift's impact on model performance using only drift metrics.

For these experiments, I built a technique that relies on drift signals to estimate model performance. I compared its results against the current SoTA performance estimation methods and checked which technique performs best.

The plot below summarizes the general results. It measures the quality of performance estimation versus the absolute performance change (lower is better).

Full experiment: https://www.nannyml.com/blog/data-drift-estimate-model-performance

In it, I describe the setup, datasets, models, benchmarking methods, and the code used in the project.

liked a model 11 months ago

CohereLabs/aya-23-35B

Text Generation • Updated 16 days ago • 3.67k • 279

reacted to andrewrreed's post with ❤️ 12 months ago

🔬 Open LLM Progress Tracker 🔬

Inspired by the awesome work from @mlabonne, I created a Space to monitor the narrowing gap between open and proprietary LLMs as scored by the LMSYS Chatbot Arena ELO ratings 🤗

The goal is to have a continuously updated place to easily visualize these rapidly evolving industry trends 🚀

🔗 Open LLM Progress Tracker: andrewrreed/closed-vs-open-arena-elo
🔗 Source of Inspiration: https://www.linkedin.com/posts/maxime-labonne_arena-elo-graph-updated-with-new-models-activity-7187062633735368705-u2jB/
