Good storytelling models that fit on an RTX 3060 12GB. Updated July 2025.

# Models

- **Current favorite**: [nbeerbower/Lyra4-Gutenberg-12B](https://huggingface.co/nbeerbower/Lyra4-Gutenberg-12B)
- [Sao10K/MN-12B-Lyra-v4](https://huggingface.co/Sao10K/MN-12B-Lyra-v4)
- [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503)
- [MarinaraSpaghetti/NemoMix-Unleashed-12B](https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B)
- [Sao10K/Fimbulvetr-11B-v2](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2)
- [Sao10K/L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) (12B)
- [PocketDoc/Dans-PersonalityEngine-V1.2.0-24b](https://huggingface.co/PocketDoc/Dans-PersonalityEngine-V1.2.0-24b)
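
Whether a model from this list fits in 12GB depends on the quantization. A rough back-of-the-envelope sketch, assuming the quantized weights take about `parameters * bits_per_weight / 8` bytes, plus a fixed allowance for context and buffers. The function names and the 1.5 GB overhead are made up for illustration; real usage varies with context length and backend.

```python
# Rough VRAM estimate for a quantized model.
# Assumption: weights dominate, at params * bits_per_weight / 8 bytes.
# The overhead figure (KV cache, buffers) is a guess, not a measured value.


def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float = 12.0,
    overhead_gb: float = 1.5,  # hypothetical allowance for context and buffers
) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    return estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for params in (8.0, 12.0, 24.0):
        for bpw in (3.0, 4.5, 6.0):
            size = estimate_weights_gb(params, bpw)
            verdict = "fits" if fits_in_vram(params, bpw) else "too big"
            print(f"{params:>4.0f}B @ {bpw} bpw: ~{size:.1f} GB weights, {verdict}")
```

Under these assumptions a 12B model is comfortable at 4 to 5 bits per weight, while a 24B model needs a more aggressive quantization or partial CPU offload.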

# Creators

- Whitelist: Sao, nbeerbower
- Blacklist: DavidAU, SicariusSicariiStuff
- Greylist: The Drummer

# Remarks
- Roleplay and storywriting are distinct tasks! Some models excel at one and fail at the other.
- Don't waste time on sampler settings; use the recommended ones and optimize the prompt instead
- Don't "overparameterize" by writing too long a prompt
- Don't underestimate the original instruct models, especially those from Mistral
- Don't underestimate classical procedural generation!
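
On the procedural generation point: one way to use it is to roll the story premise from plain random tables and keep the prompt short, leaving only the prose to the model. A minimal sketch; the tables and function names are invented for illustration, and the resulting string would be passed to whatever LLM frontend is in use.

```python
import random

# Hypothetical tables; swap in your own material.
SETTINGS = ["a flooded city", "a mountain monastery", "a derelict space station"]
PROTAGONISTS = ["a retired smuggler", "a novice cartographer", "a disgraced knight"]
CONFLICTS = ["recover a stolen heirloom", "deliver a dangerous message", "hide a fugitive"]


def make_premise(rng: random.Random) -> dict:
    """Pick one entry from each table using the supplied random generator."""
    return {
        "setting": rng.choice(SETTINGS),
        "protagonist": rng.choice(PROTAGONISTS),
        "conflict": rng.choice(CONFLICTS),
    }


def build_prompt(premise: dict) -> str:
    """Turn a premise into a short instruction for the model."""
    return (
        f"Write a short story set in {premise['setting']}. "
        f"The protagonist is {premise['protagonist']}, "
        f"who must {premise['conflict']}."
    )


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a premise can be reproduced
    print(build_prompt(make_premise(rng)))
```

Seeding the generator makes a premise reproducible, so a good one can be rerolled against different models for comparison.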

# Links
- [llama.cpp](https://github.com/ggerganov/llama.cpp) and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) - **preferred LLM software**
- [/r/localllama](https://www.reddit.com/r/LocalLLaMA/)
- /lmg/
- [LMSys Chatbot Arena Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
- [Uncensored General Intelligence Leaderboard](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard)
- [/r/SillyTavernAI](https://www.reddit.com/r/SillyTavernAI/)
- NothingiisReal discord
- NeverSleep discord
- SillyTavern discord
- BeaverAI discord