Do you know what I was planning to do this time last week?
I was preparing to write a report declaring that Jan Nano was a failed project because the benchmark results didn’t meet expectations.
But I thought: that can't be right. When I loaded the model into the app, the performance clearly felt better. So why were the benchmark results worse?
That’s when I reviewed the entire benchmark codebase and realized something fundamental: agentic and workflow-based approaches produce a huge gap and a lot of variance when benchmarking. Jan-nano was trained in an agentic setup; it simply can’t be evaluated fairly with a rigid workflow-based harness.
I made the necessary changes, and the model ended up scoring even better than before the issue appeared. It turned out the previous benchmarking method conflicted with the way the model was trained.
What if I had given up? That would’ve meant 1.5 months of training and a huge amount of company resources wasted.
But now, this is officially the most successful and biggest release for the whole team — all thanks to Jan-nano.
dots.llm1.base 🪐 a 142B MoE model with only 14B active params.
rednote-hilab/dotsllm1-68246aaaaba3363374a8aa7c
✨ Base & Instruct - MIT license
✨ Trained on 11.2T non-synthetic high-quality data
✨ Competitive with Qwen2.5/3 on reasoning, code, alignment
I'm collecting llama-bench inference results for llama 3.1 8B q4 and q8 reference models on various GPUs. The results are averages of 5 executions. The systems vary (different motherboards and CPUs), but that probably has little effect on inference performance.
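If you want to aggregate repeated runs yourself rather than relying on llama-bench's built-in repetition averaging, a minimal sketch like this works; the `model` and `avg_ts` field names below are assumptions modeled on llama-bench's JSON output, so adjust them to whatever your dump actually contains:

```python
import json
from statistics import mean, stdev

def summarize(records):
    """Average tokens/s across repeated runs of the same model.

    Each record is assumed to carry a 'model' name and an 'avg_ts'
    (tokens per second) value, similar to llama-bench's JSON records.
    Returns {model: (mean_ts, stdev_ts)}.
    """
    by_model = {}
    for r in records:
        by_model.setdefault(r["model"], []).append(r["avg_ts"])
    return {
        m: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for m, v in by_model.items()
    }

if __name__ == "__main__":
    # Five hypothetical q4 executions on one GPU:
    runs = [{"model": "llama-3.1-8B-q4", "avg_ts": t}
            for t in [52.1, 51.8, 52.4, 52.0, 51.7]]
    print(json.dumps({m: {"mean": s[0], "stdev": s[1]}
                      for m, s in summarize(runs).items()}, indent=2))
```

Reporting the standard deviation alongside the mean also makes it easier to spot noisy systems when comparing across different GPUs.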
There seem to be multiple paid apps shared here that are based on models from HF, but some people sell their wrappers as "products" and promote them here. For a long time, HF was the best and only platform for open-source model work, but with the recent AI website builders anyone can create a product (really crappy ones, btw) and try to sell it without contributing anything back to open source. Please don't do this, or at least try fine-tuning the models you use... Sorry for filling y'all's feed with this, but you know...