
hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4

Text Generation
Transformers
Safetensors
llama
llama-3.1
meta
autogptq
conversational
text-generation-inference
4-bit precision
gptq

Change max_position_embeddings to original value

#18 opened 7 months ago by AshtonIsNotHere

Can you provide one model using `group_size=1024` to make the model smaller?

#15 opened 10 months ago by shuyuej

optimum version cannot support llama3.1 405b

#14 opened 10 months ago by Atomheart-Father

OOM Error

#13 opened 10 months ago by shuyuej

Source codes to quantize the LLaMA 3.1 405B model

#10 opened 10 months ago by shuyuej (3 comments)

quantization gptq_marlin (gptq_marlin not found) does not work; removing it works

#7 opened 10 months ago by linpan (8 comments)

Accuracy tradeoff

#6 opened 10 months ago by shaamil101

Value Error when trying to run

#4 opened 10 months ago by itaytricks (2 comments)