Update README.md (committed by unsubscribe)
For instance, `internlm3-8b-instruct-fp16.gguf` can be downloaded as below:

```shell
pip install huggingface-hub
huggingface-cli download internlm/internlm3-8b-instruct-gguf internlm3-8b-instruct.gguf --local-dir . --local-dir-use-symlinks False
```
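The same repository can also be addressed from Python via the `huggingface_hub` package installed above. A small sketch (not part of this README): `hf_hub_url` builds the direct resolve link, handy for `wget`/`curl`, while `hf_hub_download` would fetch the file programmatically.

```python
# Build the direct download URL for the GGUF file with huggingface_hub.
# (hf_hub_download(repo_id=..., filename=..., local_dir=".") would download it instead.)
from huggingface_hub import hf_hub_url

url = hf_hub_url(
    repo_id="internlm/internlm3-8b-instruct-gguf",
    filename="internlm3-8b-instruct.gguf",
)
print(url)  # a https://huggingface.co/.../resolve/... link usable with wget or curl
```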
## Inference

You can use `llama-cli` for conducting inference.

```shell
build/bin/llama-cli \
    --model internlm3-8b-instruct.gguf \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 48 \
```

```shell
build/bin/llama-cli \
    --model internlm3-8b-instruct.gguf \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 48 \
```

The example chat built on this invocation ends with the model's reply:

```
The current temperature in Shanghai is 22 degrees Celsius.<|im_end|>
```
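InternLM chat models use a ChatML-style template with `<|im_start|>`/`<|im_end|>` markers; `llama-cli` applies the model's bundled chat template automatically, so the sketch below is only illustrative of what that format looks like, not the exact template string.

```python
# Illustrative sketch of a ChatML-style prompt as used by InternLM chat models.
# llama-cli builds this from the model's own template; you normally never
# need to construct it by hand.
def build_chatml_prompt(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation continues from here
    return "".join(parts)

prompt = build_chatml_prompt(
    [{"role": "user", "content": "What is the weather like in Shanghai?"}]
)
print(prompt)
```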

## Serving

`llama.cpp` provides an OpenAI API compatible server - `llama-server`. You can deploy `internlm3-8b-instruct.gguf` into a service like this:

```shell
./build/bin/llama-server -m ./internlm3-8b-instruct.gguf -ngl 48
```

At the client side, you can access the service through OpenAI API:
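For example, using only Python's standard library (a sketch assuming `llama-server`'s default address `http://localhost:8080`; pointing the official `openai` client at that base URL works the same way):

```python
import json
from urllib.error import URLError
from urllib.request import Request, urlopen

# llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint.
# By default it listens on localhost:8080; adjust if you set --host/--port.
payload = {
    "model": "internlm3-8b-instruct",
    "messages": [{"role": "user", "content": "Introduce yourself briefly."}],
}
req = Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
try:
    with urlopen(req) as resp:
        reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
except URLError as err:
    print(f"Could not reach llama-server (is it running?): {err}")
```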