Noromaid-13b-v0.3 / README.md
metadata
license: cc-by-nc-4.0
model-index:
  - name: Noromaid-13b-v0.3
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 62.8
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 84.42
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 56.86
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 50.73
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 74.74
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 23.05
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.3
          name: Open LLM Leaderboard



Disclaimer:

This is a TEST version, don't expect everything to work!!!

Use simple Alpaca. (Scroll down for a simple download of the presets we use.)


If you wanna give feedback or just chat with us, consider joining our Discord server: https://discord.gg/YEeCDkQd5d

Instruct template

Context template

Wanted a Noromaid with new datasets, a complete retrain, and normal Alpaca instead of our own format? Well, here it is: Noromaid-13b-v0.3. Suitable for RP, ERP and general stuff.

[Recommended generation settings - No settings yet (please suggest some over in the Community tab!)]

Description

This repo contains fp16 files of Noromaid-13b-v0.3.

FP16 - by IkariDev and Undi

exl2[other quants linked on the hf] - by AzureBlack

GGUF - by IkariDev and Undi

Ratings:

Note: We have permission from all users to upload their ratings. We DON'T screenshot random reviews without asking if we can put them here!

No ratings yet!

If you want your rating to be here, send us a message over on DC and we'll put up a screenshot of it here. Our DC names are "ikaridev" and "undi".

Alpaca:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Input:
{input}

### Response:
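
For anyone building the prompt in code rather than through a preset, here is a minimal sketch of filling in the Alpaca template above. The function name `build_alpaca_prompt` is hypothetical, and dropping the `### Input:` section when no input is given is an assumption borrowed from common Alpaca usage, not something this card states.

```python
# Header line copied from the Alpaca template in this card.
HEADER = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_alpaca_prompt(prompt: str, input_text: str = "") -> str:
    """Fill the Alpaca template with an instruction and optional input.

    Assumption: the "### Input:" section is omitted when input_text is empty.
    """
    parts = [HEADER, f"### Instruction:\n{prompt}"]
    if input_text:
        parts.append(f"### Input:\n{input_text}")
    # The prompt ends right after the response header so the model continues from there.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_alpaca_prompt("Describe the tavern.", "It is raining outside."))
```

The returned string can be passed to whatever backend you use; generation settings are still up to you, since none are recommended yet.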

Training data used:

  • no_robots dataset: gives the model more human-like behavior and improves the output.
  • [Aesir Private RP dataset] New data from a new, never-before-used dataset. It adds fresh data with no LimaRP spam; this is 100% new. Thanks to the MinervaAI Team and, in particular, Gryphe for letting us use it!
  • [Another private Aesir dataset]
  • [Yet ANOTHER private Aesir dataset]
  • Spicyboros 3.1
  • Toxic-dpo

Others

Undi: If you want to support me, you can here.

IkariDev: Visit my retro/neocities style website please kek

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---:|
| Avg. | 58.77 |
| AI2 Reasoning Challenge (25-Shot) | 62.80 |
| HellaSwag (10-Shot) | 84.42 |
| MMLU (5-Shot) | 56.86 |
| TruthfulQA (0-shot) | 50.73 |
| Winogrande (5-shot) | 74.74 |
| GSM8k (5-shot) | 23.05 |
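
As a quick sanity check, the "Avg." row appears to be the plain mean of the six benchmark scores. The sketch below assumes an unweighted mean rounded to two decimals; the dictionary keys are just labels chosen for illustration.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 62.80,
    "HellaSwag (10-shot)": 84.42,
    "MMLU (5-shot)": 56.86,
    "TruthfulQA (0-shot)": 50.73,
    "Winogrande (5-shot)": 74.74,
    "GSM8k (5-shot)": 23.05,
}

# Assumption: the leaderboard average is an unweighted mean of the six metrics.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg. {average:.2f}")  # prints "Avg. 58.77"
```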