Cerebras outputting <think> · 1 comment · #26 opened about 13 hours ago by therealkenc
[Fine-tuning] 8x80GiB GPUs LoRA finetuning Qwen3-235B-A22B-Instruct-2507 · 3 reactions · #25 opened 1 day ago by study-hjt
Evaluation Best Practice! · 3 reactions · #24 opened 1 day ago by Yunxz
int4 and AWQ version · 1 comment · #23 opened 1 day ago by devops724
Tokenizer template is wrong? · #22 opened 2 days ago by eugenhotaj-ppl
Update README.md · #21 opened 2 days ago by EtherAI
Update README.md · #20 opened 2 days ago by csabakecskemeti
Download on Kaggle · #19 opened 2 days ago by malik33
An interesting phenomenon · 2 comments · #18 opened 2 days ago by Shuaiqi
Can this run on a 5090, 64 GB RAM, and a 9950X3D? · 1 comment · #17 opened 3 days ago by GrimReaper000
Good idea to remove the hybrid thinking mode · 4 reactions · 1 comment · #16 opened 3 days ago by rtzurtz
Why not introduce the 235B-2507 inference model? · #15 opened 3 days ago by xldistance
What is GPT-4o-0327? · #14 opened 3 days ago by zml24
Jinja template fails on llama.cpp and has think tags for non-thinking model · #13 opened 3 days ago by sirus
Smaller models update? · 10 reactions · 4 comments · #12 opened 3 days ago by snapo
Failed to do function calling with qwen3-235b-a22b-2507 provided by OpenRouter · #11 opened 3 days ago by LucyU2001
Does this version support YaRN context extension? · #10 opened 3 days ago by rentianyue
4-bit quantisation release? · 9 reactions · 1 comment · #9 opened 3 days ago by mochiyo
[Experiment] Confirmed by Arc Prize · 1 comment · #8 opened 3 days ago by clem
Just admit you train on the benchmark datasets · 19 reactions · 8 comments · #7 opened 3 days ago by ChuckMcSneed
Review and Testing Video - Step by Step · #6 opened 3 days ago by fahdmirzac
Ensure Cerebras, Groq, and SambaNova support this. · 6 reactions · #5 opened 3 days ago by AntDX316
SimpleQA jumped from 12.2 to 54.3? · 23 reactions · 25 comments · #4 opened 3 days ago by phil111
Update README.md to fix invalid YAML · 1 reaction · 1 comment · #3 opened 3 days ago by neilmehta24
Base model · 11 reactions · #2 opened 3 days ago by NyxKrage
Small Models · 18 reactions · 4 comments · #1 opened 3 days ago by PSM24