Triangle104/MN-BackyardAI-Party-12B-v1-Q8_0-GGUF

This model was converted to GGUF format from Sao10K/MN-BackyardAI-Party-12B-v1 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Model Info:

Trained with compute from Backyard.ai | Thanks to them and @dynafire for helping me out.

Trained on 2x A100 SXM 40GB as an 8-bit LoRA.

This is a group-chat based roleplaying model, based off of 12B-Lyra-v4a2, a variant of Lyra-v4 that is currently private.

It is trained on an entirely human-written dataset, drawn from forum / internet group roleplaying styles. The only augmentation done with LLMs is to the character sheets, which were rewritten to fit the system prompt and to fit several sheets within context.

This model is still capable of 1 on 1 roleplay, though I recommend using ChatML when doing that instead.

Formatting:

Training for the multi-character roleplaying format uses a variant of ChatML in which the ChatML tokens are replaced with [INST] blocks, formatted as shown below. Use this format to draw on more of the training.

[INST]system System Prompt Here[/INST] [INST]user User's Yapping[/INST] [INST]model Model Reply[/INST]

Relevant!

Turns do not need to respect user -> model -> user. Training is done with disjointed turns that may have repeating turns to simulate real group roleplay / chat scenarios with multiple users.
Additional work may be required to fit for your front-end.
Ideally, all character cards are included in the turns; training was done with this in mind. Relevant information appears further down the page.
This is a Nemo model, so a lower Temperature and a sprinkling of min_p help.
This does require a lot of tinkering to fit within SillyTavern / other frontends.

To get better performance on Regular 1 on 1 Roleplay or Chat scenarios, use ChatML to get more of Lyra's performance.
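For reference, a 1 on 1 prompt in standard ChatML could be assembled roughly as in the sketch below. The helper name `build_chatml_prompt` and the sample messages are illustrative and not part of the original card; the `<|im_start|>` / `<|im_end|>` markers are the usual ChatML delimiters.

```python
def build_chatml_prompt(system_prompt, turns):
    """Render a 1 on 1 chat as a ChatML prompt string.

    `turns` is a list of (role, text) pairs, with role "user" or "assistant".
    The prompt ends with an open assistant header so the model writes the reply.
    """
    parts = [f"<|im_start|>system\n{system_prompt}<|im_end|>\n"]
    for role, text in turns:
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Placeholder content, purely for illustration.
prompt = build_chatml_prompt(
    "System Prompt Here",
    [("user", "Hello there.")],
)
print(prompt)
```

Most front-ends ship a ChatML preset, so hand-building the string like this is only needed when talking to a raw completion endpoint.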

For best results, set both <|im_end|> and [INST] as stopping strings. Recommended Temperature is <1, with min_p of at least 0.1.

Dataset Information:

This dataset is made from a human RP forum source, trimmed down, augmented and reformatted to fit.

Each entry has a minimum of 6 turns to be included.
The number of unique/main characters ranges from 2 to 7 per entry.
Each conversation is kept as is to preserve quality and uniqueness of the human data.
Only the added system prompt makes use of the current character sheets given.

Below is how the current Character Cards / Sheets are laid out; they are augmented from the messy and non-uniform character sheets available. To get the best results, please reformat your current character data to the one shown below, or as close to it as you can.

Character Name:
Age:
Race:
Mageblood Type: (if applicable)
Favored Magic Class: (if applicable)
Previous Magic Training: (if applicable)
Occupation/Profession: (if applicable)
Appearance: (if applicable)
Biography: (if applicable)
Good Attributes: (if applicable)
Bad Attributes: (if applicable)
Equipment: (if applicable)
Other Information: (if applicable)

Here is an example based on the above format:

Character Name: Keri Wolf
Age: 21
Race: Vampire
Mageblood Type: Hydromancy
Favored Magic Class: Aqua
Previous Magic Training: Novice
Occupation/Profession: None specified

Appearance:

Height: 5'9"
A wooden wolf necklace around her neck, contrasting with her pale skin
Three swords strapped to her waist
A tattoo of a thorn vine, her family crest, on her right arm
Normal eye color is red but changes based on her mood or the topic of conversation
Carries a hunk of wood and a carving knife for personal activities

Biography:
Keri Wolf grew up in a family of adopted siblings in Djarkel. She had a normal childhood, with her best friend Satori, and was taught basic self-defense by her father. Her brothers were considered troublemakers but remained close to her. On her 21st birthday, her family was slaughtered by a vampire nest, and she was bitten. This led to her developing vampiric traits and seeking answers at the college.

Good Attributes:

Easy-going
Observant
Helps those in trouble
Soft-hearted
Kind
Cool-headed
Good at getting out of difficult situations
Avoids violence
Gets along well with different people
Loves animals

Bad Attributes:

Sunlight sensitivity
Hatred towards vampires outside the college
Keeps feelings in check, leading to dangerous outbursts
Cruel manner of speaking
Thirst for revenge

Equipment:

Wooden wolf necklace
Three swords (one engraved with a rose, one engraved with her father's name, and one for decoration)
Carving knife
Hunk of wood
Stealth Ring
Knight's Shield

Other Information:

Secret word: rebirth
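If your character data lives in structured form, a sheet in this layout could be produced with something like the sketch below. The `render_sheet` helper and the dict-based input are assumptions made for illustration; the field labels and their order come from the template above. It is simplified: string values are written on the label's line and the blank lines between sections are left out.

```python
# Field order follows the character sheet template in this card.
FIELDS = [
    "Character Name", "Age", "Race", "Mageblood Type", "Favored Magic Class",
    "Previous Magic Training", "Occupation/Profession", "Appearance",
    "Biography", "Good Attributes", "Bad Attributes", "Equipment",
    "Other Information",
]


def render_sheet(character):
    """Render a dict of character data as a sheet, skipping missing fields.

    String values go on the label's line; list values become one item per
    line under the label.
    """
    lines = []
    for field in FIELDS:
        value = character.get(field)
        if not value:
            continue  # "(if applicable)" fields are simply omitted
        if isinstance(value, list):
            lines.append(f"{field}:")
            lines.extend(value)
        else:
            lines.append(f"{field}: {value}")
    return "\n".join(lines)


keri = {
    "Character Name": "Keri Wolf",
    "Age": "21",
    "Race": "Vampire",
    "Good Attributes": ["Easy-going", "Observant"],
}
print(render_sheet(keri))
```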

The following system prompt is augmented from available character sheets, or details from the original dataset. Placeholder names are given as shown.

You are involved in a multi-character internet-style roleplaying session with a human user, who is playing as Ballbuster Steve. Do not generate dialogue for the user's character, Ballbuster Steve. Focus on the other characters.

[Human User]
Ballbuster Steve # {user} Character Bio: [Steve's bio]

[Involved Characters]
Altair "Arty" Enzo # {char1} Character Bio: [Arty's bio]

Sukuna Gojo # {char2} Character Bio: [Sukuna's bio]

The roleplay begins now.
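A system prompt of this shape could be assembled from names and bios as in the sketch below. The function name `build_system_prompt` is hypothetical, and the exact line breaks (name on one line, "Character Bio:" on the next) are an assumption, since the layout shown here may not preserve the original whitespace. The `# {user}` / `# {char1}` annotations are treated as placeholder notes and are not emitted.

```python
def build_system_prompt(user_name, user_bio, characters):
    """Assemble the group-roleplay system prompt.

    `characters` is a list of (name, bio) pairs for the non-user characters.
    Line breaks between name and bio are an assumption of this sketch.
    """
    lines = [
        "You are involved in a multi-character internet-style roleplaying "
        f"session with a human user, who is playing as {user_name}. "
        f"Do not generate dialogue for the user's character, {user_name}. "
        "Focus on the other characters.",
        "",
        "[Human User]",
        user_name,
        f"Character Bio: {user_bio}",
        "",
        "[Involved Characters]",
    ]
    for name, bio in characters:
        lines += [name, f"Character Bio: {bio}", ""]
    lines.append("The roleplay begins now.")
    return "\n".join(lines)


system_prompt = build_system_prompt(
    "Ballbuster Steve",
    "[Steve's bio]",
    [('Altair "Arty" Enzo', "[Arty's bio]"), ("Sukuna Gojo", "[Sukuna's bio]")],
)
print(system_prompt)
```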

This is what some of the example turns look like; newlines are only for visual clarity.

[INST]user Ballbuster Steve: Being the doorman at a nightclub, especially one as popular as LUSH... [/INST]

[INST]model Altair "Arty" Enzo: While he was waiting for Jake to answer, Arty noticed from the corner of his eye... [/INST]

[INST]model Sukuna Gojo: Nick was now out of his element; he just came off his portable radio app... [/INST]

[INST]user Ballbuster Steve: Steve grabbed his black clutch from where it was stashed under the mixing desk... [/INST]

To make it easier, this is how I'd format responses for the backend:

[INST]system {system_prompt}[/INST]
[INST]user {user}: {text}[/INST]
[INST]model {char1}: {text}[/INST]
[INST]model {char2}: {text}[/INST]
[INST]user {user}: {text}[/INST]
[INST]model {char1}: {text}[/INST]<|im_end|> # For Final Turn only. Alternatively, set <|im_end|> as a stopping string.
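The backend format above could be generated from a list of turns as in this sketch. Everything named here (`build_party_prompt`, `sep`, `next_speaker`) is hypothetical. Two assumptions are worth flagging: the separator between the role tag and the content is taken to be a newline, as in ChatML, which the text here does not confirm; and the optional open `model` block at the end is a common prefill trick for steering which character speaks next, not something the card prescribes.

```python
def build_party_prompt(system_prompt, turns, next_speaker=None, sep="\n"):
    """Render a group chat in the card's [INST] format.

    `turns` is a list of (role, speaker, text) with role "user" or "model".
    Consecutive turns may share a role, as the card allows.
    `sep` sits between the role tag and the content (newline is an assumption).
    If `next_speaker` is given, an open model block is appended for that name.
    """
    blocks = [f"[INST]system{sep}{system_prompt}[/INST]"]
    for role, speaker, text in turns:
        blocks.append(f"[INST]{role}{sep}{speaker}: {text}[/INST]")
    if next_speaker is not None:
        # Assumption: leave the block open so the model continues as this character.
        blocks.append(f"[INST]model{sep}{next_speaker}:")
    # Turns are concatenated directly; newlines between them are only visual.
    return "".join(blocks)


party_prompt = build_party_prompt(
    "SYS",
    [
        ("user", "Steve", "Hi"),
        ("model", "Arty", "Yo"),
        ("model", "Sukuna", "Hey"),
    ],
    next_speaker="Arty",
)
print(party_prompt)
```

If your front-end turns out to expect a space rather than a newline after the role tag, pass `sep=" "`.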

Current Issues:

Impersonation - This is a common side-effect of pure human roleplaying data, unfortunately. Users do like writing the actions of others, though this is mostly limited to the end of a reply.

Varied Output Quality - A swipe should be enough? I only removed obviously bad entries. Output quality varies thanks to the variety of human users involved.

Character Detail Confusion when in group chats - This rarely happens, but it usually occurs when there are too many main characters, when the bio is improperly formatted and separated, or when you're using an additional, complex system prompt.

Random OOC / Story Break moments may still exist despite me filtering the data.

Limited Dataset Size -> 4K Varied Samples ranging from 2-7 characters per entry. I'm looking to expand.

Limited System Prompt? -> I'm trying to improve on this.

Fantasy-bias? -> Most of the entries are fantasy-based after all.

Training Metrics

n_sample: 4000
n_gpu: 2
global batch size: 12
lora: bnb_8bit
no. epochs: 3
lr: 0.000004
lr_scheduler: cosine
deepspeed: zero2
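As a rough sanity check on what these numbers imply, the step count can be worked out as below. This assumes every sample is used once per epoch and the last partial batch is kept; the card does not state either, nor how the per-GPU share is split between micro-batches and gradient accumulation.

```python
import math

# Values taken from the training metrics listed above.
n_sample = 4000
n_gpu = 2
global_batch_size = 12
epochs = 3

# Samples each GPU contributes per optimizer step.
per_gpu_batch = global_batch_size // n_gpu
# Optimizer steps per epoch, keeping the final partial batch (assumption).
steps_per_epoch = math.ceil(n_sample / global_batch_size)
total_steps = steps_per_epoch * epochs

print(per_gpu_batch, steps_per_epoch, total_steps)  # 6 334 1002
```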

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q8_0-GGUF --hf-file mn-backyardai-party-12b-v1-q8_0.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q8_0-GGUF --hf-file mn-backyardai-party-12b-v1-q8_0.gguf -c 2048
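With the server running, the sampler advice from earlier in this card (Temperature below 1, min_p of at least 0.1, both stopping strings) could be sent to llama-server's `/completion` endpoint along these lines. This is a sketch: the address assumes the server's default port 8080 on the local machine, the temperature of 0.8 and the 300-token limit are arbitrary illustrative values, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed local address; llama-server listens on port 8080 by default.
SERVER_URL = "http://127.0.0.1:8080/completion"


def build_request(prompt, temperature=0.8, min_p=0.1, max_tokens=300):
    """Build a POST request carrying the prompt and sampler settings."""
    payload = {
        "prompt": prompt,
        "temperature": temperature,  # card recommends a value below 1
        "min_p": min_p,              # card recommends at least 0.1
        "n_predict": max_tokens,
        "stop": ["<|im_end|>", "[INST]"],  # stopping strings from the card
    }
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt):
    """Send the request and return the generated text (needs a live server)."""
    with urllib.request.urlopen(build_request(prompt)) as response:
        return json.loads(response.read())["content"]
```

Calling `complete(...)` with a prompt in the [INST] format returns the next turn as plain text.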

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the main binary.

./llama-cli --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q8_0-GGUF --hf-file mn-backyardai-party-12b-v1-q8_0.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo Triangle104/MN-BackyardAI-Party-12B-v1-Q8_0-GGUF --hf-file mn-backyardai-party-12b-v1-q8_0.gguf -c 2048
