Commit e9f76f6 (verified) · 1 Parent(s): b6feb63
littlebird13 committed: Update README.md

Files changed (1): README.md (+2 −6)
README.md CHANGED

@@ -62,18 +62,14 @@ In the following demonstration, we assume that you are running commands under th
 You can run Qwen3 Embedding with one command:
 
 ```shell
-./build/bin/llama-embedding -m model.gguf -p "<your context here><|endoftext|>" --pooling last --verbose-prompt --embd-normalize 2
+./build/bin/llama-embedding -m model.gguf -p "<your context here>" --pooling last --verbose-prompt
 ```
 
-Or lunch a server:
+Or launch a server:
 ```shell
 ./build/bin/llama-server -m model.gguf --embedding --pooling last -ub 8192 --verbose-prompt
 ```
 
-📌 **Tip**: Qwen3 Embedding models default to using the last token as `<|endoftext|>`, so you need to manually append this token to the end of your own input context. In addition, when running the `llama-server`, you also need to manually normalize the output embeddings as `llama-server` currently does not support the `--embd-normalize` option.
-
-
-
 
 ## Evaluation
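The tip removed by this commit told `llama-server` users to L2-normalize the returned embeddings themselves, since the server did not support the `--embd-normalize` option. For reference, here is a minimal sketch of that client-side step, equivalent to `--embd-normalize 2` in `llama-embedding`; the example vector is hypothetical, standing in for an embedding returned by the server:

```python
import math

def l2_normalize(vec):
    # Divide each component by the vector's Euclidean (L2) norm so the
    # result has unit length -- the effect of `--embd-normalize 2`.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else list(vec)

# Hypothetical embedding as returned by the server:
embedding = [3.0, 4.0]
unit = l2_normalize(embedding)  # → [0.6, 0.8]
```

With unit-length vectors, cosine similarity between two embeddings reduces to a plain dot product.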