
hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4

Tags: Text Generation · Transformers · Safetensors · llama · llama-3.1 · meta · autoawq · conversational · text-generation-inference · 4-bit precision · awq
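
The tags above indicate this checkpoint is an AWQ 4-bit quantization intended to be loaded through Transformers (or served with text-generation-inference). As a minimal sketch only, assuming standard Transformers usage and not taken from the model card itself, loading and prompting the model could look roughly like this; the dtype, device placement, and generation settings are illustrative assumptions, and a 405B model still requires multiple high-memory GPUs even at 4-bit precision:

```python
# Illustrative sketch (not from the model card): load the AWQ INT4 checkpoint
# with Transformers and run one chat-style generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # AWQ kernels typically run in fp16 (assumption)
    device_map="auto",          # shard layers across the available GPUs
)

# The "conversational" tag suggests chat-template usage.
messages = [{"role": "user", "content": "What is AWQ quantization?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Serving through text-generation-inference, as the tags also suggest, is an alternative to in-process loading, but its launch configuration is not shown here.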
Community discussions (23)

  • #23: Update generation_config.json (opened 3 months ago by cdumitrascu)
  • #22: Update config.json (opened 3 months ago by cdumitrascu)
  • #21: num_key_value_heads=16 instead of 8 in the original model (opened 7 months ago by Melody32768)
  • #20: Fix eos_token and model_max_length in tokenizer_config (opened 8 months ago by AshtonIsNotHere)
  • #19: Update README.md (opened 9 months ago by MironVeryanskiy)
  • #18: Update tokenizer_config.json (opened 10 months ago by sbranco)
  • #17: Running on multi-node infrastructure (opened 10 months ago by pvalois)
  • #16: Update generation_config (3 comments, opened 10 months ago by DeepStack)
  • #13: error when quantizing my finetuned 405b model using autoawq (👀 1, 16 comments, opened 10 months ago by Atomheart-Father)
  • #12: Any chance of an AWQ version of the 405B base model? (2 comments, opened 10 months ago by lodrick-the-lafted)
  • #8: Cuda failure 1 'invalid argument' (opened 10 months ago by JulianGerhard)