
hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4

Tags: Text Generation · Transformers · Safetensors · llama · llama-3.1 · meta · autoawq · conversational · text-generation-inference · 4-bit precision · awq
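
The tags above indicate this checkpoint is an AWQ 4-bit quantization intended to be loaded through Transformers (or served with text-generation-inference). As a minimal sketch only, assuming standard Transformers usage and not taken from the model card itself, loading and prompting the model could look roughly like this; the dtype, device placement, and generation settings are illustrative assumptions, and a 405B model still requires multiple high-memory GPUs even at 4-bit precision:

```python
# Illustrative sketch (not from the model card): load the AWQ INT4 checkpoint
# with Transformers and run one chat-style generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # AWQ kernels typically run in fp16 (assumption)
    device_map="auto",          # shard layers across the available GPUs
)

# The "conversational" tag suggests chat-template usage.
messages = [{"role": "user", "content": "What is AWQ quantization?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Serving through text-generation-inference, as the tags also suggest, is an alternative to in-process loading, but its launch configuration is not shown here.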
Community discussions (23)

  • #23: Update generation_config.json (opened 3 months ago by cdumitrascu)
  • #22: Update config.json (opened 3 months ago by cdumitrascu)
  • #21: num_key_value_heads=16 instead of 8 in the original model (opened 7 months ago by Melody32768)
  • #20: Fix eos_token and model_max_length in tokenizer_config (opened 8 months ago by AshtonIsNotHere)
  • #19: Update README.md (opened 9 months ago by MironVeryanskiy)
  • #18: Update tokenizer_config.json (opened 10 months ago by sbranco)
  • #17: Running on multi-node infrastructure (opened 10 months ago by pvalois)
  • #16: Update generation_config (3 comments, opened 10 months ago by DeepStack)
  • #13: error when quantizing my finetuned 405b model using autoawq (👀 1, 16 comments, opened 10 months ago by Atomheart-Father)
  • #12: Any chance of an AWQ version of the 405B base model? (2 comments, opened 10 months ago by lodrick-the-lafted)
  • #8: Cuda failure 1 'invalid argument' (opened 10 months ago by JulianGerhard)