wenhuach committed 62d5300 (verified) · Parent(s): 555c52f

Update README.md

Files changed (1): README.md +5 -3
README.md CHANGED

@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 
 ## Model Details
 
-This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) via RTN(no algorithm tuning).
+This model is a mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) **via RTN (no algorithm tuning)**.
 Non-expert layers fall back to 8 bits; please refer to the "Generate the model" section for more details.
 Please follow the license of the original model.
 
@@ -71,7 +71,9 @@ Some of the most popular adventures include:
 ```
 
 ### Generate the model
-this pr is required if the model is fp8 and the device supports fp8 https://github.com/intel/auto-round/pull/750
+
+This PR is required if the model is FP8 and the device supports FP8: https://github.com/intel/auto-round/pull/750
+
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -90,7 +92,7 @@ for n, m in model.named_modules():
         layer_config[n] = {"bits": 8}
         print(n, 8)
 
-autoround = AutoRound(model_name,, iters=0, layer_config=layer_config)
+autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
 autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
 
 ```
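The mixed-precision fallback in the diff above (non-expert layers forced to 8 bits while experts keep the int4 default) can be sketched without loading the model: iterate over module names and build the `layer_config` dict that `AutoRound` consumes. This is a minimal illustration only — the `.experts.` name pattern, the `build_layer_config` helper, and the sample module names are assumptions for demonstration, not part of auto-round's API; check the actual module names of the model you quantize.

```python
# Sketch: build a mixed-precision layer_config from module names.
# Assumption: expert layers in the MoE blocks contain ".experts." in
# their qualified name; everything else is treated as non-expert.

def build_layer_config(module_names):
    """Map every non-expert layer to 8-bit; experts keep the 4-bit default."""
    layer_config = {}
    for name in module_names:
        if ".experts." not in name:
            layer_config[name] = {"bits": 8}
    return layer_config

# Hypothetical module names mimicking a MoE block layout.
names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.experts.0.gate_proj",
    "model.layers.0.mlp.shared_experts.down_proj",
]
# Only the ".experts." entry is excluded; the other two are marked 8-bit.
print(build_layer_config(names))
```

The resulting dict is what the README's snippet passes as `layer_config=layer_config` to `AutoRound`, with `iters=0` selecting RTN (no algorithm tuning).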