fbaldassarri committed (verified) · Commit bd34b89 · 1 parent: afef4da

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -38,7 +38,7 @@ quantized_by: fbaldassarri
 Quantized version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) using torch.float32 for quantization tuning.
 - 8 bits (INT8)
 - group size = 128
-- Symmetrical Quantization
+- Asymmetrical Quantization
 - Method AutoGPTQ
 
 Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round)
@@ -72,10 +72,10 @@ pip install -vvv --no-build-isolation -e .[cpu]
 model = AutoModelForCausalLM.from_pretrained(model_name)
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 from auto_round import AutoRound
-bits, group_size, sym, device, amp = 8, 128, True, 'cpu', False
+bits, group_size, sym, device, amp = 8, 128, False, 'cpu', False
 autoround = AutoRound(model, tokenizer, nsamples=128, iters=200, seqlen=512, batch_size=4, bits=bits, group_size=group_size, sym=sym, device=device, amp=amp)
 autoround.quantize()
-output_dir = "./AutoRound/meta-llama_Llama-3.2-1B-auto_gptq-int8-gs128-sym"
+output_dir = "./AutoRound/meta-llama_Llama-3.2-1B-auto_gptq-int8-gs128-asym"
 autoround.save_quantized(output_dir, format='auto_gptq', inplace=True)
 ```
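The substance of this commit is flipping `sym` from `True` to `False`, i.e. switching AutoRound from symmetric to asymmetric INT8 quantization. As a concept sketch only (plain Python, not AutoRound's internals): asymmetric quantization adds a zero-point alongside the scale, so a value range that is not centered on zero can still use the full `[-128, 127]` integer range.

```python
def quantize_asym_int8(values):
    """Asymmetric INT8 quantization: map [min, max] onto [-128, 127]
    using a scale plus a zero-point (symmetric mode would use scale only)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 if hi != lo else 1.0
    # zero_point shifts the quantized grid so that `lo` lands on -128
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from INT8 codes."""
    return [(x - zero_point) * scale for x in q]

# An all-positive range: asymmetric mode fits it tightly, whereas a
# symmetric scheme would waste the entire negative half of the INT8 range.
vals = [0.0, 0.5, 1.0, 2.0]
q, s, z = quantize_asym_int8(vals)
recon = dequantize(q, s, z)
```

Here the maximum round-trip error is bounded by one quantization step (`scale`), which is the practical payoff of spending the extra zero-point parameter per group.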