---
datasets:
- NewEden/KTO-IF-Dans
- NewEden/Opus-accepted-hermes-rejected-shuffled
- NewEden/KTO-Instruct-Mix
- NewEden/Purpura-Arkhaios-CC-KTO
base_model:
- Delta-Vector/Rei-V3-KTO-12B
base_model_relation: quantized
quantized_by: ArtusDev
tags:
- exl2
- roleplay
- storywriting
- axolotl
- text-generation-inference
- finetune
---
Another prototype Magnum... (This time with RL!)
Taking the previous 12B trained with Subsequence Loss, this model is meant to smooth out the base's sharp edges and improve coherency, intelligence, and prose while replicating the prose of the Claude models Opus and Sonnet.
Fine-tuned on top of Rei-V3-12B-Base, Rei-12B is designed to replicate the prose quality of Claude 3 models, particularly Sonnet and Opus, using a prototype Magnum V5 datamix.
Rei-12B uses the ChatML format. A typical conversation should be structured as:
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.\n\n
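The ChatML layout and the role prompt above can be assembled programmatically. The following is a minimal sketch, not the card author's tooling: the function names (`render_system_prompt`, `build_chatml_prompt`) and the `description` argument are hypothetical, and it assumes that the prompt line is sent as a ChatML `system` turn and that its `\n\n` stands for two real newlines followed by the character description.

```python
from typing import Dict, List, Optional

# Role prompt copied from the card; "\n\n" is assumed to mean two newlines.
SYSTEM_TEMPLATE = (
    "Currently, your role is {{char}}, described in detail below. "
    "As {{char}}, continue the narrative exchange with {{user}}.\n\n"
)


def render_system_prompt(char: str, user: str, description: str = "") -> str:
    """Fill the {{char}}/{{user}} placeholders and append a character description.

    `description` is a hypothetical argument for the character details
    that the prompt says are "described in detail below".
    """
    filled = SYSTEM_TEMPLATE.replace("{{char}}", char).replace("{{user}}", user)
    return filled + description


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    system: Optional[str] = None,
    add_generation_prompt: bool = True,
) -> str:
    """Serialize turns as <|im_start|>role\\ncontent<|im_end|> blocks."""
    turns = list(messages)
    if system is not None:
        # Assumption: the role prompt goes in a leading `system` turn.
        turns.insert(0, {"role": "system", "content": system})
    parts = [
        f"<|im_start|>{turn['role']}\n{turn['content']}<|im_end|>\n"
        for turn in turns
    ]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "Hi there!"},
        {"role": "assistant", "content": "Nice to meet you!"},
        {"role": "user", "content": "Can I ask a question?"},
    ]
    print(build_chatml_prompt(history))
```

Called without `system`, the function reproduces the conversation layout shown above, ending on an open `<|im_start|>assistant` turn. If the tokenizer shipped with the model includes a chat template, that template is the safer source of truth for the exact formatting.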
The Axolotl config used for training is available at: https://wandb.ai/new-eden/KTO/artifacts/axolotl-config/config-eyt7d5i9/v0/files/axolotl_config_jvjuci1x.yml
The model was trained for 1 epoch on 8x NVIDIA H100 GPUs generously provided by @Kalomaze.
I'd like to thank Ruka/Sama twinkman | LucyKnada | Kubernetes Bad | PocketDoc | Tav | Trappu | Alicat | and the rest of Anthracite/Pygmalion for testing, feedback, and support.