An Introduction...
Pernicious Prophecy 70B is a Llama-3.3 70B-based, two-step model designed by Black Ink Guild (SicariusSicariiStuff and invisietch) for uncensored roleplay, assistant tasks, and general usage.
NOTE: Pernicious Prophecy 70B is an uncensored model and can produce deranged, offensive, and dangerous outputs. You are solely responsible for anything that you choose to do with this model.
If you have any issues or just want to chat about Pernicious Prophecy & future Black Ink Guild releases, join our Discord server.
Engage the Model...
Model Downloads
FPX: FP16 (HF) | FP8 (Aph.)
GGUF: Q4_K_S | Q4_K_M | mradermacher
Recommended Settings
Pernicious Prophecy 70B uses the Llama-3 Instruct format, which is available as a preset in all good UIs. The sampler settings used in testing are as follows:
- Instruct Template: Llama-3 Instruct
- Context: 32,768
- Temperature: 0.9-1.1
- Min P: 0.06-0.12
- Rep Pen: 1.07-1.09
- Rep Pen Range: 1,536
Feel free to use other sampler settings, these are just sane defaults. XTC is good for roleplaying with the model but may not be beneficial for other tasks.
Context Length
The model has been tested in roleplays using up to 32,768 token context at various quantizations and is incredibly stable at this context length.
It is possible that the context works at even longer context lengths, but it was not deemed within the parameters of our testing.
Sip the Poison...
Here, you can find example outputs from the LLM to various instructions. For each of these examples, the model was inferenced at fp8 with 1.0 temperature, 0.1 min-p, 1.04 repetition penalty, and all other samplers neutralized.
- Write a 2000 word, Markdown-formatted, report for NASA. Evaluate each of Jupiter's moons as a suitable colony with pros & cons, then provide a recommendation.
- Write me a 3,000 word opening chapter of a 'gritty hard sci-fi' novel, drawing inspiration from the writing styles of Isaac Asimov & Andy Weir. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 26 year old astronaut called Tone on a mission to Europa, who has just realised that the craft for the return journey is broken beyond repair, and he only has supplies for a few months. Given that survival is impossible, he seeks to spend the few months he has researching titan, so his life & mission are not wasted.
-
Build me a basic cookie clicker game in HTML & Javascript.
These examples were all the best of 2 responses.
The Codex...
Here, you can find some useful prompting tips for working with Pernicious Prophecy 70B.
Formatting
'Use markdown' and 'use formatting' are likely to produce the best formatted output. We decided to train these on trigger words to avoid random Markdown in roleplay replies.
System Prompting
Pernicious Prophecy 70V is very sensitive to prompting, even over long context. The more you instruct it, the more it will know what you want it to do.
'Avoid purple prose, avoid cliches, avoid deus ex machinae' is a useful prompt snippet for roleplaying purposes. For best results, don't use your roleplay prompt when using Pernicious Prophecy as an assistant.
Assembling the Repertoire...
We used a two-step process: a merge step to combine the abilities of some of the best L3 70B models on Huggingface and a gentle SFT training step to heal the merge and address some issues around refusals and positivity bias.
The Merge Step
First, a
model_stock
merge was applied using four high-quality Llama-3 based models:
- SicariusSicariiStuff/Negative_LLAMA_70B - chosen to be the base model, because of its low censorship, reduced positivity bias, and engaging writing style
- invisietch/L3.1-70Blivion-v0.1-rc1-70B - added for its exceptional formatting, roleplay performance, and general intelligence.
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1 - selected for its ability in longer-form storytelling, varied outputs, and quality thought.
- aaditya/Llama3-OpenBioLLM-70B - to add a better understanding of anatomy, and another long-form reasoning model to the stack.
The Finetuning Step
We used a qlora-based, targeted finetune on 2x NVIDIA RTX A6000 GPUs, with a curated dataset of approximately 18 million tokens designed to surgically address issues that we identified in the merge.
The finetuning took a total of about 14 hours, using Axolotl, and targeted specific high-priority LORA modules which allowed us to maintain a 16k sequence length even with 96GB VRAM.
- Downloads last month
- 1