cicdatopea committed
Update README.md
README.md CHANGED
@@ -85,8 +85,30 @@ Please follow the [Build llama.cpp locally](https://github.com/ggerganov/llama.c
**5×80 GB GPUs are needed (this could be optimized); 1.4 TB of CPU memory is also needed.**
**1. Add metadata to the BF16 model** https://huggingface.co/opensourcerelease/DeepSeek-V3-bf16 (the shards need `format: pt` metadata before `transformers` can load them):
```python
import safetensors
from safetensors.torch import save_file

# Re-save each of the 163 BF16 shards in place, adding the
# {'format': 'pt'} metadata that transformers expects.
for i in range(1, 164):
    idx_str = str(i).zfill(5)  # zero-pad the shard index, e.g. 1 -> "00001"
    safetensors_path = f"model-{idx_str}-of-000163.safetensors"
    print(safetensors_path)
    tensors = dict()
    with safetensors.safe_open(safetensors_path, framework="pt") as f:
        for key in f.keys():
            tensors[key] = f.get_tensor(key)
    save_file(tensors, safetensors_path, metadata={'format': 'pt'})
```
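As a quick sanity check (not part of the original steps), you can reopen a shard and confirm the metadata was written; `safe_open` exposes the parsed header metadata directly:

```python
from safetensors import safe_open

# Spot-check the first shard; expected output: {'format': 'pt'}
with safe_open("model-00001-of-000163.safetensors", framework="pt") as f:
    print(f.metadata())
```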
**2. Replace modeling_deepseek.py with the following file.** The changes mainly align tensor devices and remove `torch.no_grad`, because AutoRound needs gradients to flow during tuning.

https://github.com/intel/auto-round/blob/deepseekv3/modeling_deepseek.py
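For intuition on why `torch.no_grad` has to go, here is a minimal, self-contained demonstration (not DeepSeek code): computations executed under `no_grad` are not recorded in the autograd graph, so they cannot be backpropagated through, which would break AutoRound's gradient-based tuning.

```python
import torch

w = torch.randn(4, 4, requires_grad=True)
x = torch.randn(1, 4)

# Under no_grad, the autograd graph is not recorded:
with torch.no_grad():
    y = x @ w
print(y.requires_grad)  # False -> calling backward() through y would fail

# The same computation without no_grad keeps gradients flowing,
# which is what the tuning step relies on:
y = x @ w
y.sum().backward()
print(w.grad.shape)  # torch.Size([4, 4])
```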
pip3 install git+https://github.com/intel/auto-round.git
**3. Tuning**

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
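# The diff hunk ends here, so the rest of the tuning script is not shown in
# this commit. A rough sketch of how it might continue, based on AutoRound's
# public API (model path and hyperparameters below are assumptions):
#
# from auto_round import AutoRound
#
# model = AutoModelForCausalLM.from_pretrained(
#     "opensourcerelease/DeepSeek-V3-bf16",
#     torch_dtype=torch.bfloat16,
# )
# tokenizer = AutoTokenizer.from_pretrained("opensourcerelease/DeepSeek-V3-bf16")
#
# autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
# autoround.quantize()
# autoround.save_quantized("DeepSeek-V3-int4")
```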