Umbra-v3-MoE-4x11b Data Card

---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- Himitsui/Kaiju-11B
- Sao10K/Fimbulvetr-11B-v2
- decapoda-research/Antares-11b-v2
- beberik/Nyxene-v3-11B
base_model:
- Himitsui/Kaiju-11B
- Sao10K/Fimbulvetr-11B-v2
- decapoda-research/Antares-11b-v2
- beberik/Nyxene-v3-11B
model-index:
- name: Umbra-v3-MoE-4x11b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 68.43
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.83
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.99
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 69.3
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.9
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.08
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
---

ExllamaV2 version of the model created by [Steelskull](https://huggingface.co/Steelskull)!

Original Model https://huggingface.co/Steelskull/Umbra-v3-MoE-4x11b

Calibration dataset [here.](https://huggingface.co/datasets/royallab/PIPPA-cleaned)

Requires ExllamaV2, which is being developed by turboderp https://github.com/turboderp/exllamav2 under an MIT license.

DON'T USE MAIN BRANCH - Test using 8192 measurement length and rp dataset. Perplexity came out too high. Will update with a normal length rp later.

Branch is 8b8h using wikitext at 4096 length

-----

Creator: SteelSkull

About Umbra-v3-MoE-4x11b: A Mixture of Experts model designed for general assistance, with a special knack for storytelling and RP/ERP.

Integrates models from notable sources for enhanced performance in diverse tasks.

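Source Models:

- Himitsui/Kaiju-11B
- Sao10K/Fimbulvetr-11B-v2
- decapoda-research/Antares-11b-v2
- beberik/Nyxene-v3-11B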

Update-Log:

The [Umbra Series] keeps rolling out from the [Lumosia Series] garage, aiming to be your digital Alfred with a side of Shakespeare for those RP/ERP nights.

What's Fresh in v3?

Didn’t reinvent the wheel, just slapped on some fancier rims. Upgraded the models and tweaked the prompts a bit. Now, Umbra's not just a general-use LLM; it's also focused on spinning stories and "Stories".

Negative Prompt Minimalism

Got the prompts to do a bit of a diet and gym routine—more beef on the positives, trimming down the negatives as usual with a dash of my midnight musings.

Still Guessing, Aren’t We?

Just so we're clear, "v3" is not the messiah of updates. It’s another experiment in the saga.

Dive into Umbra v3 and toss your two cents my way. Your feedback is the caffeine in my code marathon.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Steelskull__Umbra-v3-MoE-4x11b)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |73.09|
|AI2 Reasoning Challenge (25-Shot)|68.43|
|HellaSwag (10-Shot)              |87.83|
|MMLU (5-Shot)                    |65.99|
|TruthfulQA (0-shot)              |69.30|
|Winogrande (5-shot)              |83.90|
|GSM8k (5-shot)                   |63.08|
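
For anyone who wants to sanity-check the "Avg." row, here is a quick sketch. It assumes the average is a plain unweighted mean of the six benchmark scores listed above, rounded to two decimals; the card itself doesn't say how the number is computed. The `leaderboard_average` helper is a made-up name for illustration, not part of any leaderboard tooling.

```python
# Per-benchmark scores copied from the results table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 68.43,
    "HellaSwag (10-Shot)": 87.83,
    "MMLU (5-Shot)": 65.99,
    "TruthfulQA (0-shot)": 69.30,
    "Winogrande (5-shot)": 83.90,
    "GSM8k (5-shot)": 63.08,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: every benchmark carries equal weight.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")
```

Under that assumption the six scores sum to 438.53, and dividing by 6 gives roughly 73.088, which rounds to the 73.09 shown in the table.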