Update README.md (#8)
Update README.md (cdab70ee37816e572312be5842328c442c0334de)
Co-authored-by: HandH1998 <[email protected]>
README.md
CHANGED
@@ -31,6 +31,21 @@ To generate this weight, run the provided script in the ``./inference`` director
 python3 bf16_cast_block_int8.py --input-bf16-hf-path /path/to/bf16-weights/ --output-int8-hf-path /path/to/save-int8-weight/
 ```
 
+## 3. Troubleshooting
+
+Before inference, confirm that the "quantization_config" block in `config.json` under `/path/to/save-int8-weight/` is:
+
+```
+"quantization_config": {
+  "activation_scheme": "dynamic",
+  "quant_method": "blockwise_int8",
+  "weight_block_size": [
+    128,
+    128
+  ]
+}
+```
+
 ---
 
 # DeepSeek-R1
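A minimal sketch of that check, assuming the converted weights were saved to the placeholder path `/path/to/save-int8-weight/` used in the command above:

```python
import json

# Expected quantization settings for the blockwise int8 weights (see the diff above).
EXPECTED_QUANT_CONFIG = {
    "activation_scheme": "dynamic",
    "quant_method": "blockwise_int8",
    "weight_block_size": [128, 128],
}

# Placeholder path: replace with your own --output-int8-hf-path.
with open("/path/to/save-int8-weight/config.json") as f:
    config = json.load(f)

actual = config.get("quantization_config")
if actual != EXPECTED_QUANT_CONFIG:
    raise ValueError(f"Unexpected quantization_config: {actual}")
print("quantization_config matches the expected blockwise int8 settings.")
```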