Good storytelling models that can fit on an RTX 3060 12GB. Updated July 2025.

# Models

- **Current favorite**: [nbeerbower/Lyra4-Gutenberg-12B](https://huggingface.co/nbeerbower/Lyra4-Gutenberg-12B)
- [Sao10K/MN-12B-Lyra-v4](https://huggingface.co/Sao10K/MN-12B-Lyra-v4)
- [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503)
- [MarinaraSpaghetti/NemoMix-Unleashed-12B](https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B)
- [Sao10K/Fimbulvetr-11B-v2](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2)
- [Sao10K/L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) (12B)
- [PocketDoc/Dans-PersonalityEngine-V1.2.0-24b](https://huggingface.co/PocketDoc/Dans-PersonalityEngine-V1.2.0-24b)

# Creators

- Whitelist: Sao, nbeerbower
- Blacklist: DavidAU, SicariusSicariiStuff
- Greylist: The Drummer

# Remarks

- Roleplay and storywriting are distinct tasks! Some models excel at one and fail at the other.
- Don't waste time on sampler settings; use the recommended ones and optimize the prompt instead.
- Don't "overparameterize" by writing too long a prompt.
- Don't underestimate the original instruct models, especially those from Mistral.
- Don't underestimate classical procedural generation!

# Links

- [llama.cpp](https://github.com/ggerganov/llama.cpp) and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) - **preferred LLM software**
- [/r/localllama](https://www.reddit.com/r/LocalLLaMA/)
- /lmg/
- [LMSys Chatbot Arena Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
- [Uncensored General Intelligence Leaderboard](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard)
- [/r/SillyTavernAI](https://www.reddit.com/r/SillyTavernAI/)
- NothingiisReal discord
- NeverSleep discord
- SillyTavern discord
- BeaverAI discord
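
# Example: procedural prompt seeds

The remarks about classical procedural generation and keeping prompts short can be combined in a rough sketch like the one below. It is only an illustration: the tables, the template, and the `make_prompt` helper are all made up for this example and are not taken from any of the tools or models listed here. The idea is to let plain random tables pick the story ingredients, then hand the model one compact sentence instead of a long, hand-tuned prompt.

```python
import random

# Hypothetical random tables, invented purely for illustration.
TABLES = {
    "protagonist": ["a retired smuggler", "a novice cartographer", "a lighthouse keeper"],
    "setting": ["a flooded city", "a mountain monastery", "a night train"],
    "conflict": ["an old debt comes due", "the map turns out to be wrong", "a stranger knows their name"],
}

# One short template keeps the prompt compact.
TEMPLATE = "Write a short story about {protagonist} in {setting}, where {conflict}."


def make_prompt(seed):
    """Build a story prompt by picking one entry from each table.

    The same seed always yields the same prompt, so a good premise
    can be reproduced and retried with a different model.
    """
    rng = random.Random(seed)
    picks = {slot: rng.choice(options) for slot, options in TABLES.items()}
    return TEMPLATE.format(**picks)


if __name__ == "__main__":
    for seed in range(3):
        print(make_prompt(seed))
```

The resulting string can be passed as the user message to whichever model and frontend you already use. Seeding the generator makes it easy to compare models on exactly the same premise.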