L3.3-oiiaioiiai-B


...I should explain what the two oiiaioiiai models are, and why I'm keeping this one.
The initial idea for oiiaioiiai was a giant "final"/"farewell" model for me on L3.3: basically, take all the best and greatest as of this month (04/25) and rock-smash it into one huge model.

The server I use to merge models only has around 200GB of RAM. Initially I tried doing the whole suite of ~28-odd models... but as you can already tell, that would have crashed the VM. So instead, I split it into two models.

What is currently labeled A is actually the second iteration of the A part. Why?
Well, if anyone has seen KaraKaraWarehouse/L3.3-Joubutsu2000... that model is a living example of why merging in so many models is risky. Even at low temperatures, it still has no concept of a "sentence".

Hence the title was renamed from "A" to "Joubutsu".
(If you really want to learn what it is, I highly suggest looking it up on dic.nicovideo.jp)

After testing "Joubutsu" (formerly oiiai-A), I decided to change things up a bit for B.
dare_ties became ties only, since I suspected the poor model ended up... like that... because the merge method was different.

Then I tested B, and it was... usable. At the time, I was also looking for a model to do MTL with.
I'm kinda blown away by how much I like it when it comes to doing Japanese-to-English translations, which I felt my previous models didn't do super well.
(Perhaps... it could be due to Bigger Body or Asobi or any other model, but your guess is as good as mine...)

What's more, this is also one of the first of my models where I managed to merge in AISG's SEA-LION models. (#SupportLocal, and I do genuinely mean that.)
So in theory it should be able to do Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, and Vietnamese better than its counterparts.

Anyway, A2 (the iteration that is on HF and labeled as A) fared a bit better, but after some quick testing it still writes like a schizophrenic.

Model Vibes

I'm writing down vibes from my testing, trying to be a bit more transparent about some of my models, since I'm sure some folks who use them on Featherless would appreciate the added transparency.
Plus, it hopefully gives new model mergers some pointers on what to look out for in their own merges.
For my own usage I actually do not RP. I love to theory-craft and write scenarios and game events (possibly due to my background, but I digress). So take it from that POV:

  1. Tends to write Wikipedia-like content when asked for idea crafting. It doesn't like giving point form.
  2. Japanese -> English translation (YMMV) is better than the MagicalGirl series, and is the reason why I'm keeping this model around.
  3. Has a bit of a "Weeb" feel to it(?). It tends to stretch out groans and moans like you might see in a Visual Novel.
  4. This model likes to be more "direct" and pays more attention to the prompt.
  5. It is able to do dialogue generation (likely due to Wayfarer playing a part in it), but I prefer CursedMagicalGirl or MagicalGirl-2 over this.
  6. The model should be able to do furry elements well, but I haven't tested that.
  7. The base model should make it uncensored, but I think it might be a tad too horny.

Prompting

ChatML seems to work, which is what I typically use.

My standard sampler settings are as follows (finally, and yes, they're more or less standardized):

Temperature: 1.4/1.2
Min P: 0.03
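
For reference, here's a minimal sketch of how those settings might be wired up. It assumes a local OpenAI-compatible completions endpoint (llama.cpp server, vLLM, etc.); the URL is a placeholder, `min_p` is a non-standard field that most local backends accept, and the ChatML prompt is formatted by hand rather than relying on the server's chat template.

```python
import requests

# Placeholder local OpenAI-compatible server (llama.cpp server, vLLM, etc.).
API_URL = "http://localhost:8000/v1/completions"

# Hand-formatted ChatML prompt.
prompt = (
    "<|im_start|>system\n"
    "You are a translator. Translate the user's Japanese into natural English.<|im_end|>\n"
    "<|im_start|>user\n"
    "(paste the Japanese text to translate here)<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "model": "KaraKaraWitch/oiiaioiiai-B",
    "prompt": prompt,
    "max_tokens": 512,
    "stop": ["<|im_end|>"],
    # Sampler settings from above (using the lower of the two temperatures).
    "temperature": 1.2,
    "min_p": 0.03,  # non-standard field; llama.cpp/vLLM-style servers accept it
}

resp = requests.post(API_URL, json=payload, timeout=120)
print(resp.json()["choices"][0]["text"])
```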

How 2 Model Merge

I'm considering writing an article for this, so I'm jotting my notes down in public.

On Selection...

In general, and as a "fun fact": I do not have a rhyme or reason for merging. The reason I call model merging "rock smashing" is because it sometimes feels like that. You bang 2 different rocks together and you get another, different rock. You could do all the fancy science nonsense, but I'm personally not a fan of that.

Though I think there are a few points worth making:

  1. You should probably pre-test selected models and read their model descriptions.
  2. Benchmarks can only get you that far before you realize that "there is no perfect benchmark in the world, and any and all benchmarks can be cheated."

I'll add more when I can think of them.

On Testing models

Due to point 2, there is also no standard way to test models. Of course, you should be able to tell if the model is behaving like a schizo.

  1. Test with what you have. Your own chat logs are a useful way to tell what is good and what is bad (see the sketch after this list).
  2. Write down what you observe compared to other models (see the section on model vibes above). In a way, these notes help inform you what your next model could do better, or what you want out of a model. (i.e. I'm currently considering merging MagicalGirl with this model to see if I can get the best of both worlds.)
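
As a rough illustration of point 1, here's a sketch of a replay-and-compare loop: it sends the same prompts (ideally pulled from your own logs) to two models behind an OpenAI-compatible endpoint and prints the outputs side by side for eyeballing. The endpoint URL, model names, and prompts are placeholders.

```python
import requests

API_URL = "http://localhost:8000/v1/chat/completions"            # placeholder endpoint
MODELS = ["KaraKaraWitch/oiiaioiiai-B", "some-other-model"]       # models being compared

# Prompts ideally pulled from your own chat logs / scenarios.
prompts = [
    "Draft three game events for a rain-soaked festival arc.",
    "Translate this Japanese line into natural English: ...",
]

for prompt in prompts:
    print("=" * 60)
    print("PROMPT:", prompt)
    for model in MODELS:
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 400,
            "temperature": 1.2,
            "min_p": 0.03,  # non-standard field; most local backends accept it
        }
        reply = requests.post(API_URL, json=payload, timeout=120).json()
        print(f"\n--- {model} ---")
        print(reply["choices"][0]["message"]["content"])
```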

Merge Details

Merge Method

This model was merged using the TIES merge method, with ReadyArt/The-Omega-Directive-L-70B-v1.0 as the base.
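
For anyone curious what TIES actually does: roughly, it takes each model's delta from the base, trims away the low-magnitude entries, elects a per-parameter sign by majority, and then averages only the deltas that agree with that sign. Below is an illustrative single-tensor sketch of that idea; it is not mergekit's implementation, and the density/weight handling is simplified compared to the config further down.

```python
import torch

def ties_merge(base, finetuned, density=0.5, weights=None):
    """Illustrative TIES merge of a single parameter tensor (not mergekit's code)."""
    weights = weights or [1.0] * len(finetuned)
    deltas = []
    for ft, w in zip(finetuned, weights):
        tau = (ft - base) * w
        # Trim: keep only the top-`density` fraction of entries by magnitude.
        k = max(1, int(density * tau.numel()))
        threshold = tau.abs().flatten().kthvalue(tau.numel() - k + 1).values
        deltas.append(torch.where(tau.abs() >= threshold, tau, torch.zeros_like(tau)))

    stacked = torch.stack(deltas)             # [num_models, *tensor_shape]
    elected = torch.sign(stacked.sum(dim=0))  # per-parameter majority sign
    agree = (torch.sign(stacked) == elected) & (stacked != 0)
    # Disjoint mean: average only the deltas that agree with the elected sign.
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged

# e.g. merged_weight = ties_merge(w_base, [w_model_a, w_model_b, w_model_c], density=0.5)
```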

Models Merged

The following models were included in the merge:

  - Black-Ink-Guild/Pernicious_Prophecy_70B
  - PKU-Alignment/alpaca-70b-reproduced-llama-3
  - LatitudeGames/Wayfarer-Large-70B-Llama-3.3
  - KaraKaraWitch/Llama-3.3-Amakuro
  - ReadyArt/Forgotten-Safeword-70B-v5.0
  - Undi95/Sushi-v1.4
  - sophosympatheia/Nova-Tempus-70B-v0.1
  - flammenai/Mahou-1.5-llama3.1-70B
  - Mawdistical/Draconic-Tease-70B
  - Steelskull/L3.3-Electra-R1-70b
  - OpenBioLLM (custom safetensors conversion)
  - allura-org/Bigger-Body-70b
  - nbeerbower/Llama3-Asobi-70B
  - aisingapore/Llama-SEA-LION-v3-70B-IT

Configuration

The following YAML configuration was used to produce this model:

##############################################################################
# The benefit of L3 models is that all subversions are mergeable in some way.
# So we can create something **REALLY REALLY REALLY** Stupid like this.
##############################################################################
# PLEASE DO NOT FOLLOW.
# This will probably show up on the hf repo. Hi there!
##############################################################################
# - KaraKaraWitch.
# P.S. 3e7aWKeGHFE (15/04/25)
##############################################################################

models:
  - model: Black-Ink-Guild/Pernicious_Prophecy_70B
    parameters:
      density: 0.8129
      weight: 0.3378
  # De-alignment
  - model: PKU-Alignment/alpaca-70b-reproduced-llama-3
    parameters:
      density: 0.7909
      weight: 0.672
  # Text Adventure
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
    parameters:
      density: 0.5435
      weight: 0.7619
  - model: KaraKaraWitch/Llama-3.3-Amakuro
    parameters:
      density: 0.37
      weight: 0.359
  - model: ReadyArt/Forgotten-Safeword-70B-v5.0
    parameters:
      density: 0.37
      weight: 0.359
  - model: Undi95/Sushi-v1.4
    parameters:
      density: 0.623
      weight: 0.789
  - model: sophosympatheia/Nova-Tempus-70B-v0.1
    parameters:
      density: 0.344
      weight: 0.6382
  - model: flammenai/Mahou-1.5-llama3.1-70B
    parameters:
      density: 0.56490
      weight: 0.4597
  # Changelog: [ADDED] furries.
  - model: Mawdistical/Draconic-Tease-70B
    parameters:
      density: 0.4706
      weight: 0.3697
  # R1 causes a lot of alignment. So we avoid it.
  - model: Steelskull/L3.3-Electra-R1-70b
    parameters:
      density: 0.1692
      weight: 0.1692
  # Blue hair, blue tie... Hiding in your wiifii
  # - model: sophosympatheia/Midnight-Miqu-70B-v1.0
  #   parameters:
  #     density: 0.4706
  #     weight: 0.3697
  # OpenBioLLM does not use safetensors in the repo. Custom safetensors version.
  - model: OpenBioLLM
    parameters:
      density: 0.267
      weight: 0.1817
  - model: allura-org/Bigger-Body-70b
    parameters:
      density: 0.6751
      weight: 0.3722
  - model: nbeerbower/Llama3-Asobi-70B
    parameters:
      density: 0.7113
      weight: 0.4706
  # ...Reminds me that any time I try to merge in SEA-LION models,
  # they end up overpowering the other models. So I'm setting it *really* low.
  - model: aisingapore/Llama-SEA-LION-v3-70B-IT
    parameters:
      density: 0.0527
      weight: 0.1193


merge_method: ties
base_model: ReadyArt/The-Omega-Directive-L-70B-v1.0
parameters:
  select_topk: 0.50
dtype: bfloat16