Q6_K_L 将 output.weight 和 token embeddings 指定为了 Q8_0 量化（增加约 0.27GB），模型可能对 token_embd.weight 和 output.weight 的精度更为敏感。
使用了一些不正经数据进行 imatrix，并选择了 K 系列量化方法。

原始模型

note:

简单的请求时模型可能会跳过 <think> ，可尝试修改模板强制添加至开头

Downloads last month: 314

GGUF

Model size

7.62B params

Architecture

qwen2

Hardware compatibility

4-bit

6-bit

8-bit

16-bit

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for nuofang/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q6_K_L

Base model

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-F16

Quantized

(4)

this model