huihui-ai committed · verified
Commit ffa2eed · 1 Parent(s): a2317e2

Update README.md

Files changed (1): README.md +31 -0
README.md CHANGED
@@ -15,3 +15,34 @@ This model converted from DeepSeek-V3 to BF16.
  Here we simply provide the conversion command and related information about ollama.
  If needed, we can upload the bf16 version.
 
+ ## FP8 to BF16
+ 1. Download the [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) model; this requires approximately 641 GB of space. A scripted alternative is sketched after the command.
+ ```
+ cd /home/admin/models
+ huggingface-cli download deepseek-ai/DeepSeek-V3 --local-dir ./deepseek-ai/DeepSeek-V3
+ ```
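+ If you prefer to script the download, the same snapshot can be fetched with the `huggingface_hub` Python API; a minimal sketch (the paths mirror the CLI command above):
+ ```
+ from huggingface_hub import snapshot_download
+
+ # downloads (or resumes) the full repository into the given directory
+ snapshot_download(
+     repo_id="deepseek-ai/DeepSeek-V3",
+     local_dir="/home/admin/models/deepseek-ai/DeepSeek-V3",
+ )
+ ```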
+ 2. Create the environment (requirements.txt is expected in the model's `inference` directory, used in step 3). A quick sanity check follows the commands.
+ ```
+ conda create -yn DeepSeek-V3 python=3.12
+ conda activate DeepSeek-V3
+ pip install -r requirements.txt
+ ```
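+ Before converting, it can help to confirm that the environment has a working PyTorch build, since the conversion script depends on it; a minimal check:
+ ```
+ import torch
+
+ print(torch.__version__)
+ # the reference conversion script uses Triton GPU kernels, so a CUDA device should be visible
+ print(torch.cuda.is_available())
+ ```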
+ 3. Convert to BF16; this requires approximately an additional 1.3 TB of space. A sketch of what the cast does is shown after the command.
+ ```
+ cd deepseek-ai/DeepSeek-V3/inference
+ python fp8_cast_bf16.py --input-fp8-hf-path /home/admin/models/deepseek-ai/DeepSeek-V3/ --output-bf16-hf-path /home/admin/models/deepseek-ai/DeepSeek-V3-bf16
+ ```
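+ Conceptually, the script rescales each block-quantized FP8 weight by its stored inverse scale and casts the result to BF16. A minimal sketch, assuming 128×128 scale blocks and a `weight_scale_inv` tensor as in the DeepSeek-V3 checkpoint (treat the exact names and block size as assumptions; the reference implementation is fp8_cast_bf16.py):
+ ```
+ import torch
+
+ def dequant_to_bf16(weight: torch.Tensor, scale_inv: torch.Tensor,
+                     block: int = 128) -> torch.Tensor:
+     # upcast FP8 values, expand one scale per (block x block) tile, rescale, downcast
+     w = weight.to(torch.float32)
+     s = scale_inv.repeat_interleave(block, dim=0).repeat_interleave(block, dim=1)
+     return (w * s[: w.shape[0], : w.shape[1]]).to(torch.bfloat16)
+ ```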
+ ## BF16 to f16.gguf
+ 1. Use the [llama.cpp](https://github.com/ggerganov/llama.cpp) conversion script to convert DeepSeek-V3-bf16 to GGUF format; this requires approximately an additional 1.3 TB of space. A quick header check for the output file follows the command.
+ **Note:** this model requires [Ollama 0.5.5](https://github.com/ollama/ollama/releases/tag/v0.5.5).
+ ```
+ python convert_hf_to_gguf.py /home/admin/models/deepseek-ai/DeepSeek-V3-bf16 --outfile /home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-f16.gguf --outtype f16
+ ```
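+ A valid GGUF file begins with the ASCII magic `GGUF` followed by a little-endian version field, which gives a cheap way to sanity-check the (very large) output before quantizing:
+ ```
+ import struct
+
+ path = "/home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-f16.gguf"
+ with open(path, "rb") as f:
+     magic = f.read(4)                     # should be b"GGUF"
+     version = struct.unpack("<I", f.read(4))[0]
+ print(magic == b"GGUF", version)
+ ```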
+ 2. Use the [llama.cpp](https://github.com/ggerganov/llama.cpp) quantization program to quantize the model (llama-quantize must be compiled first); see the other [quantization options](https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp).
+ Convert to Q2_K first.
+ ```
+ llama-quantize /home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-f16.gguf /home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-Q2_K.gguf Q2_K
+ ```
+ 3. Use llama-cli to test the quantized model (a scripted alternative is sketched below).
+ ```
+ llama-cli -m /home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-Q2_K.gguf -n 2048
+ ```
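+ If you would rather smoke-test from Python, the third-party `llama-cpp-python` bindings expose the same GGUF loader; a minimal sketch (the package and its API are an assumption here, not part of llama.cpp itself):
+ ```
+ from llama_cpp import Llama
+
+ # load the quantized model and generate a short completion
+ llm = Llama(model_path="/home/admin/models/deepseek-ai/DeepSeek-V3-bf16/ggml-model-Q2_K.gguf")
+ out = llm("Hello, my name is", max_tokens=32)
+ print(out["choices"][0]["text"])
+ ```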