Modicum-of-Doubt-v1-24B-4bpw-h6-exl3

This is a quant of a merge of pre-trained language models created using mergekit.

ExLlamaV3 was used to create the quant at 4bpw with h6. With 16GB of VRAM, it's possible to run a 16K context with the cache at fp16 and some room to spare.
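The 16GB claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an estimate: the parameter count, layer count, KV head count, and head dimension are assumptions about the underlying 24B architecture and are not stated in this card, and activation and framework overhead are ignored.

```python
# Rough VRAM estimate for a 4bpw quant with an fp16 KV cache.
# Every architecture number here is an assumption, not taken from this card.

PARAMS = 23.6e9        # assumed parameter count of the text-only model
BITS_PER_WEIGHT = 4.0  # from the quant name (4bpw); the h6 suffix is assumed to
                       # denote a 6-bit output head and is ignored here
NUM_LAYERS = 40        # assumed
NUM_KV_HEADS = 8       # assumed (grouped-query attention)
HEAD_DIM = 128         # assumed
KV_BYTES = 2           # fp16 cache: 2 bytes per element
CONTEXT = 16 * 1024    # 16K tokens

GIB = 1024 ** 3


def weight_bytes(params, bits_per_weight):
    """Storage for the quantized weights."""
    return params * bits_per_weight / 8


def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem):
    """Storage for keys and values (hence the factor of 2) across all layers."""
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_elem


weights_gib = weight_bytes(PARAMS, BITS_PER_WEIGHT) / GIB
cache_gib = kv_cache_bytes(CONTEXT, NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, KV_BYTES) / GIB
total_gib = weights_gib + cache_gib

print(f"weights ~{weights_gib:.1f} GiB, cache ~{cache_gib:.1f} GiB, total ~{total_gib:.1f} GiB")
```

Under these assumptions the weights and cache together stay below 16 GiB, which is consistent with the headroom described above.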

The vision component was excised from every model contributing to the merge.

Creative text generation outputs seem to trend toward the short side, sometimes to the point of feeling choppy, hence the model name. This model is not the most stellar, but the result is interesting, going against the individual tendency of the two contributing models toward longer outputs.

Tested sampler settings: temperature 1.0, minP 0.02
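For readers unfamiliar with min-p, the following is a minimal sketch of how these two settings are commonly combined. It assumes temperature is applied before the min-p cutoff; the actual order depends on the inference backend, and the function names are hypothetical rather than part of any particular library.

```python
import math
import random


def min_p_filter(logits, temperature=1.0, min_p=0.02):
    """Return {token_index: probability} after temperature scaling and min-p.

    Assumption: temperature is applied first, then tokens whose probability
    is below min_p * (probability of the most likely token) are dropped.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize the surviving probabilities so they sum to 1
    kept_total = sum(kept.values())
    return {i: p / kept_total for i, p in kept.items()}


def sample(logits, temperature=1.0, min_p=0.02, rng=random):
    """Draw one token index from the filtered distribution."""
    kept = min_p_filter(logits, temperature, min_p)
    indices = list(kept)
    weights = [kept[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


example_logits = [5.0, 4.0, 0.0, -3.0]
filtered = min_p_filter(example_logits)
token = sample(example_logits)
```

With min-p at 0.02, only tokens at least 2% as likely as the top candidate remain eligible, so the cutoff adapts to how confident the model is at each step.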

Merge Details

Merge Method

This model was merged using the Task Arithmetic merge method, with mrfakename/mistral-small-3.1-24b-base-2503-hf as the base.
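As a rough illustration of task arithmetic, each fine-tuned model contributes a "task vector" (its weights minus the base weights), the task vectors are combined as a weighted sum, and the result is added back to the base. The sketch below works on flat lists of floats rather than real tensors. It is not mergekit's implementation, and the handling of `normalize` (dividing the combined delta by the sum of the weights) is an assumption about its behavior.

```python
def task_arithmetic(base, models, weights, normalize=True):
    """Merge parameter vectors as base + weighted sum of (model - base).

    base:    list of floats (base model parameters)
    models:  list of parameter lists, one per fine-tuned model
    weights: one weight per fine-tuned model
    normalize: assumed to divide the combined delta by sum(weights)
    """
    delta = [0.0] * len(base)
    for params, weight in zip(models, weights):
        for i, (p, b) in enumerate(zip(params, base)):
            delta[i] += weight * (p - b)  # weighted task vector

    if normalize:
        total = sum(weights)
        delta = [d / total for d in delta]

    return [b + d for b, d in zip(base, delta)]


# Toy parameters standing in for the base and the two fine-tuned models
base = [1.0, 2.0]
model_a = [2.0, 2.0]
model_b = [1.0, 4.0]

merged = task_arithmetic(base, [model_a, model_b], [0.5, 0.5])
```

If normalization works as assumed here, two weights of 0.5 already sum to 1, so `normalize: true` would leave the result unchanged for this configuration.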

Models Merged

The following models were included in the merge:

Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
PocketDoc/Dans-PersonalityEngine-V1.3.0-24b

Configuration

The following YAML configuration was used to produce this model:

base_model: mrfakename/mistral-small-3.1-24b-base-2503-hf
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: true
models:
  - model: mrfakename/mistral-small-3.1-24b-base-2503-hf
  - model: Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
    parameters:
      weight: 0.5
  - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b
    parameters:
      weight: 0.5