ElectraEXTRA

Like Electranova, but with a different model in the merge, so the thinking works better. The writing quality is also better, imo.

Settings:

Samplers (with thinking): Temp 1.05, top nsigma 0.7
Samplers (without thinking): Temp 1.15, top nsigma 0.7, minP 0.02, smoothing factor 0.3, smoothing curve 2
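
For convenience, here are the two presets as a minimal Python sketch. The key names are illustrative assumptions rather than any particular backend's parameter names; map them to whatever your frontend or sampler API calls these settings.

```python
# Sampler presets for ElectraEXTRA.
# NOTE: key names are assumptions for illustration; rename them to match
# your backend (SillyTavern, KoboldCpp, llama.cpp server, etc.).

PRESET_WITH_THINKING = {
    "temperature": 1.05,
    "top_nsigma": 0.7,  # top n-sigma sampling
}

PRESET_WITHOUT_THINKING = {
    "temperature": 1.15,
    "top_nsigma": 0.7,
    "min_p": 0.02,
    "smoothing_factor": 0.3,  # quadratic/smooth sampling factor
    "smoothing_curve": 2,     # quadratic/smooth sampling curve
}


def pick_preset(thinking_enabled: bool) -> dict:
    """Return the sampler preset for reasoning-on vs. reasoning-off use."""
    return PRESET_WITH_THINKING if thinking_enabled else PRESET_WITHOUT_THINKING
```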

Sys. prompt: LeCeption or the one from here

Quants

Static: https://huggingface.co/mradermacher/L3.3-ElectraEXTRA-R1-70b-GGUF

Weighted/imatrix: https://huggingface.co/mradermacher/L3.3-ElectraEXTRA-R1-70b-i1-GGUF
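
As an example, a quant can be fetched with huggingface_hub and loaded with llama-cpp-python. This is a sketch only: the GGUF filename below is an assumption about mradermacher's usual naming scheme, so check the repo's file list before downloading.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one imatrix quant. The filename is an assumed example; verify it
# against the files actually listed in the repository.
gguf_path = hf_hub_download(
    repo_id="mradermacher/L3.3-ElectraEXTRA-R1-70b-i1-GGUF",
    filename="L3.3-ElectraEXTRA-R1-70b.i1-Q4_K_M.gguf",
)

llm = Llama(
    model_path=gguf_path,
    n_ctx=8192,       # context length; raise it if you have the memory
    n_gpu_layers=-1,  # offload all layers to GPU when possible
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
    temperature=1.05,
)
print(out["choices"][0]["message"]["content"])
```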

Merge Details

Merge Method

This model was merged using the SCE merge method, with Steelskull/L3.3-Electra-R1-70b as the base.

Models Merged

The following models were included in the merge:

- Sao10K/Llama-3.3-70B-Vulpecula-r1
- Nohobby/L3.3-Prikol-70B-EXTRA

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Sao10K/Llama-3.3-70B-Vulpecula-r1
    parameters:
      select_topk:
        - filter: self_attn
          value: 0.1
        - filter: "q_proj|k_proj|v_proj"
          value: 0.1
        - filter: "up_proj|down_proj"
          value: 0.1
        - filter: mlp
          value: 0.1
        - value: 0.1  # default for other components
  - model: Nohobby/L3.3-Prikol-70B-EXTRA
    parameters:
      select_topk:
        - filter: self_attn
          value: 0.15
        - filter: "q_proj|k_proj|v_proj"
          value: 0.1
        - filter: "up_proj|down_proj"
          value: 0.1
        - filter: mlp
          value: 0.1
        - value: 0.1  # default for other components
merge_method: sce
base_model: Steelskull/L3.3-Electra-R1-70b
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: Steelskull/L3.3-Electra-R1-70b
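
The merge can be reproduced by feeding this config to mergekit. A minimal sketch, assuming the YAML above is saved as electraextra.yml and using mergekit's documented Python API (option names can differ between mergekit versions):

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "electraextra.yml"             # assumed filename for the config above
OUTPUT_PATH = "./L3.3-ElectraEXTRA-R1-70b"  # where the merged weights are written

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # include a tokenizer in the output
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

The mergekit-yaml command-line entry point does the same thing in a single step.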