Michael Goin (mgoin) PRO
AI & ML interests: LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
- Updated a model 2 days ago: mgoin/Llama-3.2-1B-Instruct-FP8-ATTN
- Updated a model 2 days ago: mgoin/Llama-3.2-1B-Instruct-FP8-dynamic-ATTN
- Updated a model 6 days ago: neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4-FP8-dynamic
mgoin's activity
- Model does not run with vLLM (2 replies), #3 opened 9 days ago by aswad546
- Nice model, any info on scripts used to quantize? (1 reply), #1 opened 12 days ago by RonanMcGovern
- Add config_format and load_format to vLLM args, #5 opened about 1 month ago by mgoin
- Update config.json to use null for sliding_window, #4 opened about 1 month ago by mgoin
- Adding `safetensors` variant of this model, #1 opened about 1 month ago by SFconvertbot
- Is this the standard GPTQ quantization? (1 reply), #5 opened about 2 months ago by molereddy
- Model weights are not loaded (4 replies), #3 opened 4 months ago by MarvelousMouse
- Update model card, #1 opened about 2 months ago by nm-research
- Add chat_template to tokenizer_config.json, #1 opened about 2 months ago by nm-research
- 7900xtx: torch._scaled_mm is only supported on CUDA devices with compute capability >= 9.0 or 8.9, or ROCm MI300+ (1 reply), #3 opened about 2 months ago by aaaaaaaaaasdf
- Why is the Pixtral activation function "gelu" when the reference code uses "silu"? (2 replies), #10 opened 2 months ago by mgoin
- Update tokenizer_config.json with chat_template (3 replies), #11 opened 2 months ago by mgoin
- Any chance your team is working on a 4-bit Llama-3.2-90B-Vision-Instruct-quantized.w4a16 version? (1 reply), #1 opened 3 months ago by mrhendrey
- OOM with 24 GB VRAM (3 replies), #1 opened 3 months ago by Klopez
- Latest vLLM Docker (v0.6.2) fails to load (2 replies), #1 opened 3 months ago by choronz333
- Issue with loading model (1 reply), #1 opened 4 months ago by xSumukhax
- Can it run on A100/A800 with vLLM? (3 replies), #1 opened 5 months ago by Parkerlambert123
- Weights do not exist when trying to deploy in a SageMaker endpoint (1 reply), #1 opened 4 months ago by LorenzoCevolaniAXA
- 8-kv-heads (4 replies), #17 opened 5 months ago by ArthurZ