unsubscribe committed
Commit 9b7d181 · verified · 1 parent: c3b4411

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -48,7 +48,7 @@ For instance, `internlm3-8b-instruct-fp16.gguf` can be downloaded as below:
 
 ```shell
 pip install huggingface-hub
-huggingface-cli download internlm/internlm3-8b-instruct-gguf internlm3-8b-instruct-fp16.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download internlm/internlm3-8b-instruct-gguf internlm3-8b-instruct.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 ## Inference
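Note: the hunk header above still shows `internlm3-8b-instruct-fp16.gguf` in the prose near line 48, so a stale mention of the old filename may remain outside this hunk. A quick check on a local checkout (a sketch; assumes README.md sits in the repo root):

```shell
# List any lines still mentioning the old fp16 filename; no output means
# the rename was applied everywhere in the README.
grep -n "internlm3-8b-instruct-fp16.gguf" README.md
```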
@@ -59,7 +59,7 @@ You can use `llama-cli` for conducting inference. For a detailed explanation of
 
 ```shell
 build/bin/llama-cli \
---model internlm3-8b-instruct-fp16.gguf  \
+--model internlm3-8b-instruct.gguf  \
 --predict 512 \
 --ctx-size 4096 \
 --gpu-layers 48 \
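For readers tuning this command: `--predict` caps the number of generated tokens, `--ctx-size` sets the context window, and `--gpu-layers` is how many layers to offload to the GPU. A minimal CPU-only smoke test with the renamed file (a sketch; the prompt text is illustrative):

```shell
# CPU-only run: zero offloaded layers, short generation to verify the file loads.
build/bin/llama-cli \
  --model internlm3-8b-instruct.gguf \
  --predict 128 \
  --ctx-size 4096 \
  --gpu-layers 0 \
  --prompt "Hello"
```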
@@ -84,7 +84,7 @@ build/bin/llama-cli \
 
 ```shell
 build/bin/llama-cli \
---model internlm3-8b-instruct-fp16.gguf \
+--model internlm3-8b-instruct.gguf \
 --predict 512 \
 --ctx-size 4096 \
 --gpu-layers 48 \
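Both `llama-cli` invocations assume all 48 layers fit in VRAM. If they do not, a smaller `--gpu-layers` value splits the model between GPU and CPU; the count below is a guess to tune per device:

```shell
# Partial offload for smaller GPUs: the remaining layers run on the CPU.
build/bin/llama-cli \
  --model internlm3-8b-instruct.gguf \
  --predict 512 \
  --ctx-size 4096 \
  --gpu-layers 24
```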
@@ -123,10 +123,10 @@ The current temperature in Shanghai is 22 degrees Celsius.<|im_end|>
 
 ## Serving
 
-`llama.cpp` provides an OpenAI API compatible server - `llama-server`. You can deploy `internlm3-8b-instruct-fp16.gguf` into a service like this:
+`llama.cpp` provides an OpenAI API compatible server - `llama-server`. You can deploy `internlm3-8b-instruct.gguf` into a service like this:
 
 ```shell
-./build/bin/llama-server -m ./internlm3-8b-instruct-fp16.gguf -ngl 48
+./build/bin/llama-server -m ./internlm3-8b-instruct.gguf -ngl 48
 ```
 
 At the client side, you can access the service through OpenAI API:
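To verify the renamed model end to end, a client request against `llama-server`'s OpenAI-compatible endpoint can look like the sketch below (assumes the default address `127.0.0.1:8080`; `llama-server` serves whatever model it was launched with, so the `model` field is informational):

```shell
# Query the OpenAI-compatible chat completions endpoint exposed by llama-server.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "internlm3-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'
```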
 