# Model Card for Viel v3
This is Version 3 of the Viel model. It's an experiment in creating an AI with a built-in personality.
An upgrade from Viel v2, this one actually uses proper dataset formatting.
It also uses Mistral instead of Llama 3, because we want personality, not IQ.
OH MY GOD IT WORKS!!!
ALSO: Quants Here: https://huggingface.co/mradermacher/Viel-Mistral-v3-GGUF
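If you grab one of those GGUF quants, here's a minimal llama-cpp-python sketch for running it locally. The quant filename below is a guess (use whichever file you actually download), and the context size is just an example:

```python
from llama_cpp import Llama

# Hypothetical quant filename from the GGUF repo above; adjust to the file you downloaded.
llm = Llama(
    model_path="Viel-Mistral-v3.Q4_K_M.gguf",
    n_ctx=4096,              # example context length, not a recommendation from this card
    chat_format="chatml",    # the model expects ChatML (see Uses below)
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hey Viel, status report."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```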
## Model Details

### Character Detail

Viel, an industrial-grade robot repurposed as a shitty assistant.

### Model Description
- Developed by: Ars Paradox
- Funded by: Google Colab
- Model type: Mistral 7B Instruct
- License: GPL-3
- Finetuned from model: Mistral 7B Instruct
## Uses
Use the ChatML prompt format to run the model.
No need to add any additional instructions. Just start talking; you'll see how it works.
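For example, with transformers and a raw ChatML prompt. This is a minimal sketch; the repo id below is an assumption (only the GGUF quant repo is linked on this card), so check the actual model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "ArsParadox/Viel-Mistral-v3"  # hypothetical repo id, not confirmed by this card
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# ChatML wraps every turn in <|im_start|>role ... <|im_end|> markers.
prompt = (
    "<|im_start|>user\n"
    "Hey Viel, what exactly are you?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```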
## Recommendations

Viel is inaccurate and not that smart, and she knows it.
## Training Details

### Training Data
ArsParadox/Viel_Dataset_Lite
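A quick sketch for pulling the dataset and sanity-checking its layout. The split name and column layout are assumptions; the trainer below expects a ready-made "text" column:

```python
from datasets import load_dataset

# Assumes a "train" split and a pre-formatted "text" column
# (the SFTTrainer below uses dataset_text_field = "text").
dataset = load_dataset("ArsParadox/Viel_Dataset_Lite", split="train")
print(dataset.column_names)
print(dataset[0])
```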
### Training Procedure

Trained with Unsloth. Uhhh... here's the SFTTrainer setup:
```python
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,  # Setting this to True can make training ~5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        # num_train_epochs = 5,
        max_steps = 180,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",  # Set to "wandb" etc. for experiment tracking.
    ),
)
```
Does that answer your question?
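The snippet above takes `model`, `tokenizer`, `dataset`, and `max_seq_length` as already defined. For context, here is a minimal sketch of the usual Unsloth setup that produces them; the base checkpoint name, sequence length, and LoRA hyperparameters are assumptions, not the exact configuration used for this model:

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumed value, not confirmed by this card

# Load the base model in 4-bit and attach LoRA adapters (standard Unsloth recipe).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",  # assumed base checkpoint
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                      # example LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
)
```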
## Model Card Contact
Discord: @pandu.paradox for further queries.

Feel free to test it and see how it works~
Hopefully this time the personality gets embedded better than in the last 3 models.
I will work on another character next.