L3.3-oiiaioiiai-B
...I should explain what the two oiiaioiiai models are, and why I'm keeping this one.
The initial idea for oiiaioiiai is the giant "final"/"farewell" model for me on L3.3. Basically, take all the best and greatest as of this month (04/25) and rock-smash it all into one huge model.
The server I use to merge models only has around 200GB of RAM. Initially I tried doing the whole suite of ~28-odd models... But as you can probably tell, this would have crashed the VM. So instead, I split it into 2 models.
What is currently labeled A is actually the second iteration of the A part. Why?
Well, if anyone has seen KaraKaraWarehouse/L3.3-Joubutsu2000... That model is a living example of why merging in so many models is risky. Even at low temperatures, it still has no concept of a "sentence".
Hence the title was renamed from "A" to "Joubutsu".
(If you really want to learn what it is, I highly suggest looking up on dic.nicovideo.jp)
After testing "Joubutsu" (formerly oiiai-A), I decided to change things up a bit for B: dare_ties became plain ties, since I suspected the poor model turned out... like that... because the merge method was different.
Then I tested B, and it was... usable. At the time, I was also looking for a model to do MTL with.
I'm kinda blown away by how much I like it for Japanese-to-English translation, which I felt my previous models didn't do super well.
(Perhaps... it could be due to Bigger Body or Asobi or any other model, but your guess is as good as mine...)
What's more, this is also one of the first models where I managed to merge in AISG's SEA-LION models. (#SupportLocal, and I do genuinely mean that)
So in theory it should be able to do Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, and Vietnamese better than its counterparts.
Anyway, A2 (the iteration that is on HF and labeled as A) fared a bit better, but still writes like a schizophrenic after some quick testing.
Model Vibes
I'm writing down the vibes from my testing, trying to be a bit more transparent about some of my models, since I'm sure some people who use them on Featherless would appreciate the added transparency.
Plus, it may give new model mergers an idea of what to look out for in their own merges.
For my own usage, I actually do not RP. I love to theory-craft and write scenarios and game events (possibly due to my background, but I digress). So take the following from that POV:
- Tends to write Wikipedia-like content when asked for idea crafting. It doesn't like giving point form.
- Japanese -> English translation (YMMV) is better than the MagicalGirl series, and is the reason why I'm keeping this model around.
- Has a bit of a "Weeb" feel to it(?). It tends to stretch out groans and moans like you might see in a Visual Novel.
- This model likes to be more "direct" and pays more attention to the prompt.
- It is able to do dialogue generation (likely due to Wayfarer playing a part in it). But I prefer CursedMagicalGirl or MagicalGirl-2 over this.
- Model should be able to do furry elements well, but I haven't tested it.
- Base model should make it uncensored. But I think it might be a tad too horny.
Prompting
ChatML seems to work, and is what I typically use.
My standard sampler settings are as follows (finally, and yes, they're more or less standardized):
Temperature: 1.4/1.2
Min P: 0.03
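For anyone wiring this up themselves, here's a minimal sketch of what the ChatML format and the sampler settings above look like in practice. The `build_chatml` helper and the `SAMPLERS` dict are illustrative only (not from any particular library); key names like `min_p` follow common inference backends, so check your own backend's docs.

```python
def build_chatml(system: str, user: str) -> str:
    """Wrap a system prompt and one user turn in ChatML special tokens,
    leaving the assistant turn open for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Sampler settings as listed on this card. The card gives temperature
# as 1.4/1.2; 1.2 is the lower of the two options.
SAMPLERS = {
    "temperature": 1.2,
    "min_p": 0.03,
}

prompt = build_chatml("You are a scenario writer.", "Draft a game event.")
print(prompt)
```

Most frontends (e.g. SillyTavern) apply the ChatML template for you, so this is mainly relevant if you're calling a raw completion endpoint.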
How 2 Model Merge
I'm considering writing an article for this so I'm jotting my notes down in public.
On Selection...
In general, and fun fact: I do not have a rhyme or reason for merging. The reason I call model merging "rock smashing" is because it's like that sometimes. You bang 2 different rocks together and you get another, different rock. You could do all the fancy science nonsense, but I'm personally not a fan of that.
Though I think there should be a few points to be made:
- You should probably pre-test select models and read their model description.
- Benchmarks can only get you that far before you realize that "there is no perfect benchmark in the world, and any and all benchmarks can be cheated."
I'll add more when I can think of them.
On Testing models
Because of the second point above, there is also no standard way to test models. Of course, you should at least be able to tell if the model is behaving like a schizo.
- Test with what you have. Your own chat logs are a useful way to tell what is good and what is bad.
- Write down what you observe compared to other models (see the section on model vibes above). In a way, these notes help inform you what your next model could do better, or what you want out of a model. (e.g. I'm currently considering merging MagicalGirl with this model to see if I can get the best of both worlds.)
Merge Details
Merge Method
This model was merged using the TIES merge method using ReadyArt/The-Omega-Directive-L-70B-v1.0 as a base.
Models Merged
The following models were included in the merge:
- nbeerbower/Llama3-Asobi-70B
- aisingapore/Llama-SEA-LION-v3-70B-IT
- Mawdistical/Draconic-Tease-70B
- allura-org/Bigger-Body-70b
- sophosympatheia/Nova-Tempus-70B-v0.1
- Steelskull/L3.3-Electra-R1-70b
- PKU-Alignment/alpaca-70b-reproduced-llama-3
- OpenBioLLM
- Undi95/Sushi-v1.4
- ReadyArt/Forgotten-Safeword-70B-v5.0
- KaraKaraWitch/Llama-3.3-Amakuro
- flammenai/Mahou-1.5-llama3.1-70B
- Black-Ink-Guild/Pernicious_Prophecy_70B
- LatitudeGames/Wayfarer-Large-70B-Llama-3.3
Configuration
The following YAML configuration was used to produce this model:
##############################################################################
# The benefit of L3 models is that all subversions are mergable in some way.
# So we can create something **REALLY REALLY REALLY** Stupid like this.
##############################################################################
# PLEASE DO NOT FOLLOW.
# This will probably show up on the hf repo. Hi there!
##############################################################################
# - KaraKaraWitch.
# P.S. 3e7aWKeGHFE (15/04/25)
##############################################################################
models:
  - model: Black-Ink-Guild/Pernicious_Prophecy_70B
    parameters:
      density: 0.8129
      weight: 0.3378
  # De-alignment
  - model: PKU-Alignment/alpaca-70b-reproduced-llama-3
    parameters:
      density: 0.7909
      weight: 0.672
  # Text Adventure
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
    parameters:
      density: 0.5435
      weight: 0.7619
  - model: KaraKaraWitch/Llama-3.3-Amakuro
    parameters:
      density: 0.37
      weight: 0.359
  - model: ReadyArt/Forgotten-Safeword-70B-v5.0
    parameters:
      density: 0.37
      weight: 0.359
  - model: Undi95/Sushi-v1.4
    parameters:
      density: 0.623
      weight: 0.789
  - model: sophosympatheia/Nova-Tempus-70B-v0.1
    parameters:
      density: 0.344
      weight: 0.6382
  - model: flammenai/Mahou-1.5-llama3.1-70B
    parameters:
      density: 0.56490
      weight: 0.4597
  # Changelog: [ADDED] furries.
  - model: Mawdistical/Draconic-Tease-70B
    parameters:
      density: 0.4706
      weight: 0.3697
  # R1 causes a lot of alignment. So we avoid it.
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      density: 0.1692
      weight: 0.1692
  # Blue hair, blue tie... Hiding in your wiifii
  # - model: sophosympatheia/Midnight-Miqu-70B-v1.0
  #   parameters:
  #     density: 0.4706
  #     weight: 0.3697
  # OpenBioLLM does not use safetensors in the repo. Custom safetensors version.
  - model: OpenBioLLM
    parameters:
      density: 0.267
      weight: 0.1817
  - model: allura-org/Bigger-Body-70b
    parameters:
      density: 0.6751
      weight: 0.3722
  - model: nbeerbower/Llama3-Asobi-70B
    parameters:
      density: 0.7113
      weight: 0.4706
  # ...Reminds me that any time I try to merge in SEA-LION models,
  # they end up overpowering the other models. So I'm setting it *really* low.
  - model: aisingapore/Llama-SEA-LION-v3-70B-IT
    parameters:
      density: 0.0527
      weight: 0.1193
merge_method: ties
base_model: ReadyArt/The-Omega-Directive-L-70B-v1.0
parameters:
  select_topk: 0.50
dtype: bfloat16
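For reference, a config like the one above is typically run through mergekit's CLI. A sketch, assuming the YAML is saved as `config.yaml` (the output directory name is illustrative, and flags vary by mergekit version):

```shell
# Install mergekit and run the TIES merge described by the YAML above.
# Expect very heavy RAM use for 70B merges (~200GB was mentioned earlier);
# --lazy-unpickle reduces memory pressure when loading checkpoints.
pip install mergekit
mergekit-yaml config.yaml ./L3.3-oiiaioiiai-B --lazy-unpickle
```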