Update README.md
README.md CHANGED
````diff
@@ -62,18 +62,14 @@ In the following demonstration, we assume that you are running commands under the
 You can run Qwen3 Embedding with one command:
 
 ```shell
-./build/bin/llama-embedding -m model.gguf -p "<your context here
+./build/bin/llama-embedding -m model.gguf -p "<your context here>" --pooling last --verbose-prompt
 ```
 
-Or
+Or launch a server:
 ```shell
 ./build/bin/llama-server -m model.gguf --embedding --pooling last -ub 8192 --verbose-prompt
 ```
 
-📌 **Tip**: Qwen3 Embedding models default to using the last token as `<|endoftext|>`, so you need to manually append this token to the end of your own input context. In addition, when running the `llama-server`, you also need to manually normalize the output embeddings as `llama-server` currently does not support the `--embd-normalize` option.
-
-
-
 
 ## Evaluation
````
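For reference, the tip removed above described two manual steps when using the server: appending `<|endoftext|>` to the input (the model pools on the last token), and L2-normalizing the returned embeddings yourself, since `llama-server` does not support the `--embd-normalize` option. Below is a minimal Python sketch of that workflow; the `/v1/embeddings` endpoint path, the default port `8080`, and the response shape are assumptions about the llama.cpp server, not taken from this diff.

```python
import json
import math
import urllib.request

# Assumption (not from the diff): llama-server is listening on
# http://localhost:8080 and exposes an OpenAI-compatible POST /v1/embeddings
# endpoint when started with --embedding.
SERVER = "http://localhost:8080/v1/embeddings"

def embed(text: str) -> list[float]:
    # Per the removed tip: append <|endoftext|> manually so last-token
    # pooling ("--pooling last") sees the expected end-of-text token.
    body = json.dumps({"input": text + "<|endoftext|>"}).encode("utf-8")
    req = urllib.request.Request(
        SERVER, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        vec = json.load(resp)["data"][0]["embedding"]
    # Per the removed tip: llama-server does not apply --embd-normalize,
    # so L2-normalize the embedding manually.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# With normalized vectors, cosine similarity reduces to a plain dot product.
q = embed("What is the capital of China?")
d = embed("Beijing is the capital of China.")
print(sum(a * b for a, b in zip(q, d)))
```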