RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic · Discussions

🪄 InferenceService name updated

#8 opened 21 days ago by

change-name

#7 opened 25 days ago by

Overview states 109b, should be 17b

#6 opened 27 days ago by

Failing to quantize using your method

#4 opened 3 months ago by

vLLM launch parameters

#3 opened 4 months ago by

Why not FP8 with static and per-tensor quantization?

#2 opened 4 months ago by

Thank you for uploading this.

#1 opened 4 months ago by chriswritescode