# mistral-nemo-gutenberg-12B-v4-exl2
This repository contains various EXL2 quantisations of nbeerbower/mistral-nemo-gutenberg-12B-v4.
## Quantisations available

| Branch  | Description         | Recommended                                            |
|---------|---------------------|--------------------------------------------------------|
| 2.0-bpw | 2 bits per weight   | Low Quality - Smallest Available Quantisation          |
| 3.0-bpw | 3 bits per weight   |                                                        |
| 4.0-bpw | 4 bits per weight   | ✔️ - Recommended for Low-VRAM Environments             |
| 5.0-bpw | 5 bits per weight   |                                                        |
| 6.0-bpw | 6 bits per weight   | ✔️ - Best Quality / VRAM Balance                       |
| 6.5-bpw | 6.5 bits per weight | ✔️ - Near-Perfect Quality, Slightly Higher VRAM Usage  |
| 8.0-bpw | 8 bits per weight   | Best Available Quality - Almost always unnecessary     |
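To use one of these quantisations, download the branch that matches your VRAM budget and load it with exllamav2. A minimal sketch using standard `huggingface_hub` and recent `exllamav2` APIs; the branch choice, prompt, and token count below are arbitrary examples, not recommendations:

```python
# Download one quantisation branch and run a short generation with exllamav2.
# The branch names in the table above map directly to Git revisions.
from huggingface_hub import snapshot_download
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = snapshot_download(
    repo_id="CameronRedmore/mistral-nemo-gutenberg-12B-v4-exl2",
    revision="6.0-bpw",  # any branch from the table above
)

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocated while the weights load
model.load_autosplit(cache)               # split across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Once upon a time,", max_new_tokens=128))
```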
## Original README
TheDrummer/Rocinante-12B-v1 finetuned on jondurbin/gutenberg-dpo-v0.1.
### Method
Finetuned using an A100 on Google Colab for 3 epochs.
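The exact training script isn't published here; as a rough illustration only, a DPO finetune matching that description could look like the following TRL sketch. Everything beyond the base model, dataset, and epoch count is an assumed placeholder:

```python
# Illustrative sketch, not the author's actual Colab script: DPO-finetune the
# base model on the Gutenberg DPO pairs for 3 epochs using Hugging Face TRL.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "TheDrummer/Rocinante-12B-v1"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# The dataset provides prompt/chosen/rejected columns, as DPOTrainer expects.
dataset = load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train")

args = DPOConfig(
    output_dir="mistral-nemo-gutenberg-12B-v4",
    num_train_epochs=3,             # matches the description above
    per_device_train_batch_size=1,  # assumed; sized for a single A100
)
trainer = DPOTrainer(model=model, args=args,
                     train_dataset=dataset, processing_class=tokenizer)
trainer.train()
```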