Steve Chen
stev236
AI & ML interests
Running local models on different projects.
Recent Activity
new activity
17 days ago
Qwen/Qwen3-8B:New 8B model much slower than old 7B model when running on vLLM.
new activity
17 days ago
Qwen/Qwen3-4B:Why are the new 4B and 8B models slower than the previous 7B-1M model??
new activity
18 days ago
Qwen/Qwen3-14B:Long context: YaRN max_position_embeddings 32K or 40k?
Organizations
None yet
stev236's activity
New 8B model much slower than old 7B model when running on vLLM.
1
#6 opened 17 days ago
by
stev236
Why are the new 4B and 8B models slower than the previous 7B-1M model??
3
#6 opened 17 days ago
by
stev236
Long context: YaRN max_position_embeddings 32K or 40k?
➕
1
2
#10 opened 18 days ago
by
stev236
vLLM example for 'Offline' should include an input image.
❤️
1
2
#47 opened about 2 months ago
by
stev236