Zheng Han (traphix)
0 followers · 11 following
AI & ML interests: None yet
Recent Activity
New activity 6 days ago in chriswritescode/Qwen3-235B-A22B-Instruct-2507-AWQ-Swift: Qwen3-235B-A22B-Instruct-2507, int4-w4a16 or AWQ? Which one has better accuracy recovery?
New activity 10 days ago in Qwen/Qwen2.5-14B-Instruct-1M: Does vLLM 0.7.3 support this model?
New activity 10 days ago in RedHatAI/README: Any plans to quantize Qwen3-235B-A22B-Instruct-2507?
Organizations: None yet
traphix's activity
New activity in chriswritescode/Qwen3-235B-A22B-Instruct-2507-AWQ-Swift (6 days ago)
Qwen3-235B-A22B-Instruct-2507, int4-w4a16 or AWQ? Which one has better accuracy recovery? (1 reply)
#1 opened 6 days ago by traphix
New activity in Qwen/Qwen2.5-14B-Instruct-1M (10 days ago)
Does vLLM 0.7.3 support this model? (1 reply)
#10 opened 5 months ago by traphix
New activity in RedHatAI/README (10 days ago)
Any plans to quantize Qwen3-235B-A22B-Instruct-2507?
#1 opened 10 days ago by traphix
New activity in RedHatAI/DeepSeek-R1-0528-quantized.w4a16 (about 2 months ago)
Are 4 × H20 96 GB GPUs sufficient to run this model? (1 reply)
#2 opened about 2 months ago by milongwong
New activity in Qwen/Qwen3-30B-A3B-FP8 (2 months ago)
Remove vLLM FP8 Limitation (10 replies)
#2 opened 3 months ago by simon-mo
New activity in RedHatAI/Qwen3-235B-A22B-FP8-dynamic (2 months ago)
Error running on A100? (2 replies)
#4 opened 3 months ago by traphix
New activity in RedHatAI/Qwen3-235B-A22B-FP8-dynamic (3 months ago)
Any plans for int8 quantized.w8a8?
#5 opened 3 months ago by traphix
New activity in justinjja/Qwen3-235B-A22B-INT4-W4A16 (3 months ago)
How about int8 quantization?
#3 opened 3 months ago by traphix
New activity in RedHatAI/Qwen3-235B-A22B-FP8-dynamic (3 months ago)
How much RAM (in GB) is needed when quantizing Qwen3-235B-A22B?
#2 opened 3 months ago by traphix
Liked a model (3 months ago)
RedHatAI/Qwen3-235B-A22B-FP8-dynamic • Text Generation • 235B params • Updated May 6 • 1.7k downloads • 2 likes
New activity in RedHatAI/Qwen3-235B-A22B-FP8-dynamic (3 months ago)
Where are the safetensors? (👀 1 • 1 reply)
#1 opened 3 months ago by traphix
New activity in RedHatAI/Qwen3-32B-FP8-dynamic (3 months ago)
What is the difference between Qwen/Qwen3-32B-FP8 and this quantized model? (4 replies)
#1 opened 3 months ago by traphix
Liked a model (3 months ago)
RedHatAI/DeepSeek-R1-quantized.w4a16 • Text Generation • Updated Apr 22 • 49 downloads • 7 likes
New activity in RedHatAI/DeepSeek-R1-quantized.w4a16 (4 months ago)
Does vLLM 0.8.4 support this quantized model? (1 reply)
#1 opened 4 months ago by traphix
New activity in Qwen/QwQ-32B (5 months ago)
This model beats Qwen Max! (👍 1 • 7 replies)
#33 opened 5 months ago by MrDevolver
New activity in QuixiAI/DeepSeek-R1-AWQ (5 months ago)
Why does "MLA is not supported with awq_marlin quantization. Disabling MLA." appear with 32 × 4090 GPUs (4 nodes, vLLM 0.7.2)? (👍 1 • 3 replies)
#14 opened 5 months ago by FightLLM
Are there any accuracy results compared with the original DeepSeek-R1? (2 replies)
#15 opened 5 months ago by traphix
New activity in QuixiAI/DeepSeek-R1-AWQ (6 months ago)
Has anyone evaluated the performance of the AWQ version of the model on benchmarks? (4 replies)
#8 opened 6 months ago by liuqianchao
skips the thinking process (11 replies)
#5 opened 6 months ago by muzizon
Deployment framework (27 replies)
#2 opened 6 months ago by xro7