---
language:
- fr
- it
- de
- es
- en
license: apache-2.0
inference:
  parameters:
    temperature: 0.5
---

This is [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), converted to GGUF and quantized to q8_0. Both the main model tensors and the embedding/output tensors are q8_0.

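For anyone wanting to reproduce a similar quant, the pipeline looks roughly like this. This is a hedged sketch, not the exact commands used for this upload: the tool names (`convert_hf_to_gguf.py`, `llama-quantize`) and the `--output-tensor-type`/`--token-embedding-type` flags follow llama.cpp conventions from around this period, and all paths are placeholders.

```shell
#!/bin/sh
# Hypothetical reproduction sketch (not the author's exact commands).
MODEL_DIR=./Mixtral-8x7B-Instruct-v0.1   # placeholder: local HF checkout
QUANT=Q8_0

if command -v llama-quantize >/dev/null 2>&1; then
    # 1. Convert the HF checkpoint to a single f16 GGUF file.
    python convert_hf_to_gguf.py "$MODEL_DIR" \
        --outtype f16 --outfile mixtral-f16.gguf
    # 2. Quantize to q8_0, forcing the output and token-embedding
    #    tensors to q8_0 as well (matching this upload).
    llama-quantize \
        --output-tensor-type q8_0 --token-embedding-type q8_0 \
        mixtral-f16.gguf mixtral-q8_0.gguf "$QUANT"
else
    echo "llama.cpp tools not on PATH; skipping actual conversion"
fi
echo "target quantization: $QUANT"
```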
The model is split using the `llama.cpp/llama-gguf-split` CLI utility into shards no larger than 1GB, which makes it less painful to resume the download if it is interrupted.

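The split step can be sketched as below. The `--split-max-size` option is what caps each shard's size; the exact invocation is an assumption based on llama.cpp's `gguf-split` tool, and the shard count (25) is only an example for illustrating the naming scheme.

```shell
#!/bin/sh
# Hypothetical sketch of the sharding step.
IN=Mixtral-8x7B-Instruct-v0.1-q8_0.gguf    # placeholder input file
OUT_PREFIX=Mixtral-8x7B-Instruct-v0.1-q8_0 # shard name prefix

if command -v llama-gguf-split >/dev/null 2>&1; then
    # Cap every shard at 1 GB; an interrupted download then only
    # needs to re-fetch the shard that failed.
    llama-gguf-split --split-max-size 1G "$IN" "$OUT_PREFIX"
else
    echo "llama-gguf-split not on PATH; skipping"
fi

# gguf-split names shards with a five-digit index and total count:
first_shard=$(printf '%s-%05d-of-%05d.gguf' "$OUT_PREFIX" 1 25)
echo "$first_shard"
```

When loading, llama.cpp only needs the path to the first shard; it discovers the remaining shards automatically.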
This is uploaded pretty much just as a personal backup. Mixtral Instruct is one of my favorite models.

All operations were done with `llama.cpp` commit [`8cd1bcfd3fc9f2b5cbafd7fb7581b3278acec25f`](https://github.com/ggerganov/llama.cpp/tree/8cd1bcfd3fc9f2b5cbafd7fb7581b3278acec25f) (2024-08-11).