InferenceIllusionist committed
Update README.md
README.md CHANGED

@@ -1,5 +1,5 @@
 ---
-base_model: mistralai/
+base_model: mistralai/Mistral-Nemo-Instruct-2407
 library_name: transformers
 quantized_by: InferenceIllusionist
 language:
@@ -16,7 +16,6 @@ tags:
 - iMat
 - gguf
 - Mistral
-- Math
 license: apache-2.0
 ---
 <img src="https://i.imgur.com/P68dXux.png" width="400"/>
@@ -25,11 +24,11 @@ license: apache-2.0
 
 <b>Important Note: Inferencing is *only* available on this fork of llama.cpp at the moment: https://github.com/ggerganov/llama.cpp/pull/8604
 
-Other front-ends like the main branch of llama.cpp, kobold.cpp, and text-generation-web-ui may not work as intended
+Other front-ends like the main branch of llama.cpp, kobold.cpp, and text-generation-web-ui may not work as intended</b>
 
 Quantized from fp16.
 * Weighted quantizations were created using fp16 GGUF and groups_merged.txt in 92 chunks and n_ctx=512
-* Static fp16 also included in repo
+* Static fp16 will also be included in repo
 
 For a brief rundown of iMatrix quant performance please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)
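For reference, the weighted-quantization step described in the README (importance matrix from the fp16 GGUF and groups_merged.txt, 92 chunks, n_ctx=512) can be sketched with llama.cpp's `llama-imatrix` and `llama-quantize` tools. File names and the Q4_K_M target below are hypothetical, shown only to illustrate the workflow:

```shell
# Hypothetical file names; adjust paths to your local checkout and model.
# 1) Compute an importance matrix from the fp16 GGUF using the
#    groups_merged.txt calibration text, 92 chunks at 512-token context:
./llama-imatrix -m Mistral-Nemo-Instruct-2407-fp16.gguf \
    -f groups_merged.txt --chunks 92 -c 512 -o imatrix.dat

# 2) Apply the matrix while quantizing to an example target type (Q4_K_M):
./llama-quantize --imatrix imatrix.dat \
    Mistral-Nemo-Instruct-2407-fp16.gguf \
    Mistral-Nemo-Instruct-2407-iMat-Q4_K_M.gguf Q4_K_M
```

Static quants (like the fp16 mentioned above) skip step 1 and run `llama-quantize` without the `--imatrix` flag.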