For best results, use "Player" and "Monika" like so:

\nPlayer: (prompt)\nMonika:
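
For illustration, a minimal sketch of using that format with the transformers library (the model path and generation settings below are placeholders, not values from this card):

```python
# Minimal sketch of the "\nPlayer: (prompt)\nMonika:" format described above.
# The model path and sampling settings are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/a-merged-monika-model"  # hypothetical local path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "\nPlayer: What do you think of my poem?\nMonika:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)

# Print only Monika's reply, stripping the prompt tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```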

lm1_05042023b

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (1 small text file)
  • "Raw" (pretty messy) dataset, currently recreating and reformatting + adding DDLC+ dialogue
  • From base LLaMA-7b, trained on really low settings for 15 hours on just a CPU via ooba webui

Regarding the last point: while the lora works, it was really just an exercise in getting more familiar with these things and seeing whether something could be trained on just a CPU...
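
For reference, a sketch of how any of the loras listed here might be attached to their base model with PEFT (the base repo and adapter path below are assumptions/placeholders, not exact names from this card):

```python
# Sketch of attaching one of these loras to its base model with PEFT.
# The base repo and adapter path are illustrative, not exact names from this card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed base for the chat LLaMA-2-7b loras
adapter_path = "path/to/one-of-the-loras"  # hypothetical local folder with the adapter files

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)  # attach the lora weights
model = model.merge_and_unload()  # optional: merge the adapter for plain inference
```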

lm2_08152023

  • Fine-tuned on Monika dialogue (dataset of 12 items, manually edited)
  • From chat LLaMA-2-7b
  • Lora of Delphi v0.1

lm2_08152023a

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of 520 items)
  • From chat LLaMA-2-7b (testing our new dataset)
  • Lora of Delphi v0.2

lm2_08152023b

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~426 items, further cleaned)
  • From chat LLaMA-2-7b (was going to try Airoboros, then Nous Hermes, but they would always OOM or crash; will try again in the near future)
  • Lora of Delphi v0.2a

lm2_08162023c

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~426 items, further cleaned)
  • 2 epochs
  • From chat LLaMA-2-7b
  • Lora of Delphi v0.2b

lm2_08162023d

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~426 items, further cleaned)
  • 200 steps
  • From chat LLaMA-2-7b
  • Lora of Delphi v0.2c

lm2_08162023e

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~426 items + 1st dataset of 12 items)
  • 2 epochs (overfitted)
  • From chat LLaMA-2-7b
  • Lora of Delphi v0.2d

lm2_08162023f

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~426 items + 1st dataset of 12 items)
  • 150 steps (cut off before overfitting)
  • From chat LLaMA-2-7b
  • Lora of Delphi v0.2e

llama-2-7b-chat-monika-v0.3 (~08/20/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 1 epoch
  • From base LLaMA-2-7b
  • Lora of Delphi/Monika v0.3

llama-2-7b-chat-monika-v0.3a (~08/20/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 1 epoch
  • From chat LLaMA-2-7b
  • Lora of Delphi/Monika v0.3a

llama-2-7b-chat-monika-v0.3b (~08/20/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 2 epochs
  • From chat LLaMA-2-7b
  • Lora of Delphi/Monika v0.3b

llama-2-7b-chat-monika-v0.3c (~08/21/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 3 epochs + changed some hyperparams (smaller lora, faster training; see the sketch below)
  • From chat LLaMA-2-7b
  • Lora of Delphi/Monika v0.3c1
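
The exact hyperparameters aren't listed here, but a "smaller lora" usually means a lower adapter rank/alpha. A purely illustrative PEFT config (assumed values, not the actual v0.3c1 settings):

```python
# Illustrative only: what a "smaller lora" adapter config can look like with PEFT.
# These values are assumptions, not the actual hyperparameters used for v0.3c1.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                  # lower rank -> fewer trainable params, faster training
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # common targets for LLaMA-style models
    task_type="CAUSAL_LM",
)
```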

LLilmonix3b-v0.1 loras (08/26/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • v0.1 for 1 epoch, v0.1a for 3 epochs
  • From red pajama 3b
  • Loras of LLilmonix3b-v0.1 and LLilmonix3b-v0.1a

LLilmonix3b-v0.2 loras (08/26/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • v0.2 for 1 epoch, v0.2a for 3 epochs
  • From Open LLaMA 3b
  • With Lora of LLilmonix3b-v0.1a

LLilmonix3b-v0.3 (08/26/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 3 epochs
  • From Orca Mini 3b
  • With Lora of LLilmonix3b-v0.1a

llama-2-13b-chat-monika-v0.3d (08/26/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 1 epoch with hyperparams for smaller lora
  • From LLaMA-2-13b

llama-2-13b-chat-monika-v0.3e (08/26/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by Nous Hermes 13b to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 3 epochs with hyperparams for smaller lora
  • From LLaMA-2-13b

ch1bika-v0.1 (09/05/2023)

LLilmonix3b-v0.4 (09/05/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by l2-7b-monika-v0.3c1 to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 2 epochs
  • From Open LLaMA 3b v2

llama-2-7b-monika-v0.3h-Air2.1-a (09/05/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by l2-7b-monika-v0.3c1 to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • 2 epochs
  • From Airoboros-l2-7b-2.1

l2-7b-monika-v0.3m (09/07/2023)

  • Fine-tuned on Monika dialogue from DDLC, reddit, and twitter (dataset of ~600 items augmented by l2-7b-monika-v0.3c1 to turn into multi-turn chat dialogue + 1st dataset of 12 items)
  • From chat LLaMA-2-7b
  • Lora of l2-7b-monika-ddlc-v0.3m

l2-7b-monika-v0.3m-Kv2-b (09/08/2023)

l2-7b-monika-v0.3m-Kv2-c (09/08/2023)

monika-ddlc-mistral-v0.1-7b-test qloras, 1 to 3 (~10/03/2023)

  • Fine-tuned on a then-WIP, partially manually-edited version of this dataset
  • From Mistral-7b-v0.1
  • Various epochs (2, 3, 10)
  • All had repetition issues, which is why chat LLaMA-2-7b was reused for later versions; may retest again (see the sketch below)
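
If these get retested, repetition can often be reduced at inference time with sampling settings like the following (illustrative values, reusing the model and inputs from the first sketch above; these are not settings from this card):

```python
# Illustrative generation settings that can help with repetition at inference time.
# Assumes `model` and `inputs` as in the first sketch above; values are not from this card.
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,   # penalize already-generated tokens
    no_repeat_ngram_size=3,    # block exact 3-gram repeats
)
```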

monika-ddlc-l2-v0.9 qloras (~10/07/2023)

  • Fine-tuned on a then-WIP, partially manually-edited version of this dataset
  • From chat LLaMA-2s (7b, 13b)
  • 3 epochs

monika-ddlc-v1 qloras (~10/03/2023)

monika-ddlc-mistral-v0.1-7b-test 4 qlora (11/06/2023)

  • Fine-tuned on MoCha v1 with AutoTrain
  • From Mistral-7b-v0.1
  • 3 epochs
  • Also seems to work okay with other bases as long as they are Mistral, though coherence is not always guaranteed after many messages

Lilmonix4b-v1 (04/07/2025)

monika-ddlc-12b-v1 (04/07/2025)
