NeoChen1024/dolphin-2.9.4-llama3.1-8b-GGUF


NeoChen1024 committed on Sep 25, 2024

Commit ca73e4f · verified · 1 Parent(s): 2f7fc75

Update README.md

Files changed (1)

README.md +1 -1


@@ -8,6 +8,6 @@ base_model:
 IQ4_XS (4.2G,  8.7992 +/- 0.11237, fits into 8GiB VRAM + 4096 context with F16 KV cache)
 Q4_K_M (4.6G,  8.7948 +/- 0.11223, fits into 8GiB VRAM + 4096 context with F16 KV cache, also good for CPU inference on E5-26xx v3/v4)
 Q8_0   (8.0G,  8.5970 +/- 0.10933, imatrix derived from it)
-F16    ( 15G,  8.6617 +/- 0.11043, for 24GiB VRAM, imatrix derived from it)
+F16    ( 15G,  8.6617 +/- 0.11043, for 24GiB VRAM)
 ```
 Perplexity measured with `-fa -c 2048 -ub 2048` on UTF-8 text version of ["Wired Love" from Project Gutenberg](http://www.gutenberg.org/ebooks/24353).
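
The "fits into 8GiB VRAM + 4096 context with F16 KV cache" notes in the README can be sanity-checked with a rough estimate. The sketch below is not from the commit. It assumes the usual Llama 3.1 8B attention shape (32 layers, 8 KV heads, head dimension 128), reads the listed sizes as GiB, and ignores compute buffers and other runtime overhead, so treat the result as a lower bound. The names `kv_cache_bytes` and `QUANT_SIZES_GIB` are illustrative only.

```python
# Rough VRAM estimate: model weights plus an F16 KV cache.
# Assumed Llama 3.1 8B attention shape (not stated in the README).
N_LAYERS = 32
N_KV_HEADS = 8
HEAD_DIM = 128
BYTES_F16 = 2
GIB = 1024 ** 3


def kv_cache_bytes(n_ctx, n_layers=N_LAYERS, n_kv_heads=N_KV_HEADS,
                   head_dim=HEAD_DIM, bytes_per_elem=BYTES_F16):
    """Bytes needed to store keys and values for n_ctx tokens."""
    # Factor 2: one tensor for K and one for V in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx


# File sizes as listed in the README, read here as GiB (an assumption).
QUANT_SIZES_GIB = {"IQ4_XS": 4.2, "Q4_K_M": 4.6, "Q8_0": 8.0, "F16": 15.0}

kv_gib = kv_cache_bytes(4096) / GIB

for name, size_gib in QUANT_SIZES_GIB.items():
    total = size_gib + kv_gib
    verdict = "fits" if total < 8.0 else "does not fit"
    print(f"{name}: {total:.1f} GiB weights + KV cache, {verdict} in 8 GiB")
```

Under these assumptions the KV cache at 4096 context takes 0.5 GiB, which leaves the two 4-bit quantizations a few GiB of headroom on an 8 GiB card, while Q8_0 already exceeds it on weights plus cache alone.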