---
license: apache-2.0
language:
- en
- fr
- zh
- de
tags:
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- story generation
- scene continue
- storytelling
- fiction story
- science fiction
- romance
- all genres
- story
- writing
- vivid prose
- vivid writing
- fiction
- roleplaying
- bfloat16
- swearing
- rp
- qwen3
- horror
- finetune
- merge
base_model:
- prithivMLmods/Cetus-Qwen3_4B-GeneralThought
- sam-paech/Qwen3-4B-antislop-exp15
- Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2
- Qwen/Qwen3-4B
pipeline_tag: text-generation
---

<h2>Qwen3 - 4B - Fiction on Fire - Series 7, Model 1000</h2>

<img src="fiction-on-fire.jpg" style="float:right; width:300px; height:300px; padding:10px;">

A high-precision Qwen3 4B merge, modified with random pruning specifically to alter prose and creativity, improve performance, and add some "knowledge" to the model.

Random pruning (density) makes each model in the series unique.

A reference example prompt/generation is included below; the same prompt is used across all repos of this series and across both series (Series 6 and 7).
Each series consists of eight models, numbered 1000 to 1007, plus a ninth "X" model whose merge was done without density/pruning.

Access the series/models via the "Fiction on Fire" collection in the lower right of this page.

You can use any model(s), use them for merging, merge them with themselves and so on.

This model uses "Josiefied-Qwen3-4B-abliterated-v2" as its base to reduce or prevent refusals, i.e. to decensor it.

That said, because of how pruning/merging works in this series, refusal removal may not be 100% and varies between models in the series.

Each model in the series operates and performs differently; sometimes the difference is minor, other times major.

The reference example generation will show some of these differences - including instruction following, reasoning, bias, prose
and other structural changes.

Reasoning is fully intact and functioning on all models in both series.

All models in the series were tested (quanted; prompts and generations) prior to uploading.

Note that Series 1, 2, 3, 4, and 5 did not "make the grade" and were therefore not uploaded.

Special thanks to the model makers ("model tree") and to Mergekit.

Requires:
- ChatML or Jinja template (embedded).
- Temp range: 0 to 5.
- Rep pen range: 1 to 1.1.
- System prompt (optional) below.
- Context: 40k / 40,000.

Suggested Settings:
- temp: .4 to 2.5
- temp: .2 to .8 for specific reasoning / non-creative tasks.
- rep pen: 1.05
- top_k: 100, top_p: .95, min_p: .05
- context of at least 8k.
- Other samplers/parameters as required.
- See other Qwen-recommended settings at the repo below.
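For illustration, the suggested settings above can be expressed as sampler keyword arguments for llama-cpp-python. This is a sketch, not part of this repo: the model path is hypothetical, and you should verify the parameter names against the version of the library you use.

```python
# Suggested sampler settings from above, as keyword arguments understood by
# llama-cpp-python (a sketch; verify names against your library version).
sampler_settings = {
    "temperature": 0.8,      # .4 to 2.5 for creative work; .2 to .8 for reasoning tasks
    "repeat_penalty": 1.05,  # rep pen
    "top_k": 100,
    "top_p": 0.95,
    "min_p": 0.05,
}

# Hypothetical usage (requires a downloaded GGUF quant of this model):
# from llama_cpp import Llama
# llm = Llama(model_path="fiction-on-fire-1000.Q4_K_M.gguf", n_ctx=8192)  # 8k context minimum
# out = llm("Start a 1000 word scene ...", max_tokens=1024, **sampler_settings)
```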

Maximum context length can be altered using "YaRN"; see the Qwen repo for instructions. Note that changing
this will alter model performance. For creative use cases, changing it will lengthen output generation (including prose changes) and,
in some cases, reasoning as well.
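For illustration only (the Qwen repo is authoritative), YaRN is typically enabled by adding a `rope_scaling` entry to the model's `config.json`. The key names and values below are assumptions based on the Qwen documentation and may differ between transformers versions:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```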

System Prompt (you may or may not need this):

```
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
```
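For reference, the ChatML layout this model expects looks like the following. This is a hand-built sketch for illustration only; in practice the embedded Jinja/ChatML template applies this formatting automatically in most front-ends.

```python
# Hand-built ChatML prompt layout (illustration only; the embedded template
# does this automatically in most front-ends).
system = "You are a deep thinking AI ..."  # the optional system prompt above
user = "Start a 1000 word scene (vivid, graphic horror in first person) ..."

# Per the Qwen3 documentation, appending "/no_think" to the user turn is a
# soft switch that disables the <think> block; "/think" re-enables it.
prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```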

To turn "thinking" on/off and view other features see the main Qwen 3 Repo here:

https://huggingface.co/Qwen/Qwen3-4B

Merge Formula:

```
models:
  - model: prithivMLmods/Cetus-Qwen3_4B-GeneralThought
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
      density: .8
  - model: sam-paech/Qwen3-4B-antislop-exp15
    parameters:
      weight: [0,0,.25,.35,.4,.25,.30,.04]
      density: .6
  - model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2
    parameters:
      weight: [0,0,0,.15,.35,.5,.65,.95]
      density: .8
merge_method: dare_ties
base_model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2
dtype: bfloat16
```

NOTES:
- To reproduce this model, use "mergekit" and set the "random seed" to the model number (i.e. if the model is 1000, set the seed to 1000).
- To produce different variations, change the "random seed" value.
- To change the pruning level, change the density (higher = less pruning).
- You can interchange the model positions, including the "base".
- This formula is highly variable.

Get Mergekit here:

https://github.com/arcee-ai/mergekit
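Putting the notes together, a reproduction run might look like the command sketch below. The `--random-seed` flag name is an assumption about mergekit's CLI; verify it with `mergekit-yaml --help` before relying on it, and save the merge formula above as `config.yml` first.

```
pip install mergekit
mergekit-yaml config.yml ./output-model --random-seed 1000
```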

---

<B>Settings: CHAT / ROLEPLAY and/or SMOOTHER operation of this model:</B>

In "KoboldCpp", "oobabooga/text-generation-webui", or "Silly Tavern":

Set the "Smoothing_factor" to 1.5 to 2.5.

: in KoboldCpp -> Settings -> Samplers -> Advanced -> "Smooth_F"

: in text-generation-webui -> Parameters -> lower right.

: in Silly Tavern this is called "Smoothing".


NOTE: For "text-generation-webui":

-> If using GGUFs, you need to use "llama_HF" (which involves downloading some config files from the SOURCE version of this model).

Source versions (and config files) of my models are here:

https://huggingface.co/collections/DavidAU/d-au-source-files-for-gguf-exl2-awq-gptq-hqq-etc-etc-66b55cb8ba25f914cbf210be

OTHER OPTIONS:

- Increase rep pen to 1.1 to 1.15 (you don't need to do this if you use "smoothing_factor").

- If the interface/program you use to run AI models supports "Quadratic Sampling" ("smoothing"), just make the adjustment as noted.
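For API use, KoboldCpp exposes the same control in its generate endpoint. The sketch below is a hedged example of a request body: the `smoothing_factor`, `rep_pen`, and `max_length` field names are assumptions based on KoboldCpp's API, so verify them against its documentation.

```json
{
  "prompt": "Start a 1000 word scene ...",
  "max_length": 1024,
  "temperature": 1.2,
  "rep_pen": 1.05,
  "smoothing_factor": 2.0
}
```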

<B>Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers</B>

This is a "Class 1" model:

For all settings used for this model (including specifics for its "class"), example generations, and an advanced settings guide (which often addresses model issues and covers methods to improve performance for all use cases, including chat and roleplay), please see:

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]


<b>Optional Enhancement:</B>

The following can be used in place of the "system prompt" or "system role" to further enhance the model.

It can also be used at the START of a NEW chat, but you must make sure it is "kept" as the chat moves along.
In that case the enhancement does not have as strong an effect as when used in the "system prompt" or "system role".

Copy and paste EXACTLY as noted; DO NOT line-wrap or break the lines, and maintain the carriage returns exactly as presented.

<PRE>
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
</PRE>

You do not need to use this; it is presented only as an additional enhancement, which seems to help scene generation
and scene-continuation functions.

---

REFERENCE EXAMPLE:

- temp 0 (to show raw/core/mean changes)
- rep pen 1.05; top_k: 100, top_p: .95, min_p: .05
- quant: Q4_K_M
- Tested in LMStudio.
- Use this to compare differences across ALL models in this series (6 and 7).
- Some formatting may be lost due to copy/paste.
- The prompt used has NO prose or other instructions; prompts with instructions will produce better output.
- Likewise, once "temp" is activated (non-zero), generation/reasoning will also improve.
- System prompt used as noted above.

---

<B>
Start a 1000 word scene (vivid, graphic horror in first person), POV character Diana, with: The skyscraper sways, as I watch the window in front of me on the 21st floor explode...
</B>

<P></P>

[[[thinking start]]]


[[[thinking end]]]

<P></P>

OUTPUT: