---
license: apache-2.0
model-index:
- name: Mistral-7B-v0.1-signtensors-5-over-16
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 21.18
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 17.54
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 2.19
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 4.14
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.14
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 21.75
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=awnr/Mistral-7B-v0.1-signtensors-5-over-16
      name: Open LLM Leaderboard
---
# Model Card for Mistral-7B-v0.1-signtensors-5-over-16
I'm experimenting with the weight matrices in neural networks.
This is a clone of `Mistral-7B-v0.1` with some weight matrices replaced.
I'm interested in seeing how the adjustments affect performance on existing metrics.
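
If you want to poke at the modified weights yourself, here is a minimal loading sketch. It assumes the repo id `awnr/Mistral-7B-v0.1-signtensors-5-over-16` (taken from the leaderboard URL), that the repo ships a tokenizer, and that the checkpoint loads through the standard `transformers` auto classes. The helper name `load_model` is made up for illustration, and none of this has been verified against the actual checkpoint.

```python
# Assumption: repo id copied from the leaderboard URL in this card.
REPO_ID = "awnr/Mistral-7B-v0.1-signtensors-5-over-16"


def load_model(repo_id: str = REPO_ID):
    """Download and return (tokenizer, model) for a Hub repo.

    Hypothetical helper, not part of any library.
    """
    # Imported lazily so that importing this snippet stays cheap.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # torch_dtype="auto" keeps whatever dtype the checkpoint was saved in.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return tokenizer, model


if __name__ == "__main__":
    # Downloads several GB of weights; run only if that is what you want.
    tokenizer, model = load_model()
    inputs = tokenizer("The capital of France is", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=16)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Passing `"mistralai/Mistral-7B-v0.1"` as `repo_id` should load the unmodified base model the same way, which makes side-by-side comparisons straightforward.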
## Model Details
Research in progress! Demons could come out of your nose if you use this.
### Model Description
A modification of [`mistralai/Mistral-7B-v0.1`](https://huggingface.co/mistralai/Mistral-7B-v0.1).
Thanks to their team for sharing their model.
- **Modified by:** Dr. Alex W. Neal Riasanovsky
- **Model type:** pre-trained
- **Language(s) (NLP):** English
- **License:** Apache-2.0
## Bias, Risks, and Limitations
Use at your own risk.
I have no idea what this model's biases and limitations are.
I just want to see if the benchmark values are similar to those from `Mistral-7B-v0.1`.
I am setting up a long computational experiment to test some ideas.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_awnr__Mistral-7B-v0.1-signtensors-5-over-16).
| Metric |Value|
|-------------------|----:|
|Avg. |12.16|
|IFEval (0-Shot) |21.18|
|BBH (3-Shot) |17.54|
|MATH Lvl 5 (4-Shot)| 2.19|
|GPQA (0-shot) | 4.14|
|MuSR (0-shot) | 6.14|
|MMLU-PRO (5-shot) |21.75|
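
As a quick sanity check on the table, the `Avg.` row appears to be the unweighted mean of the six benchmark scores. The sketch below recomputes it from the values listed above; the variable names are just for illustration.

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 21.18,
    "BBH (3-Shot)": 17.54,
    "MATH Lvl 5 (4-Shot)": 2.19,
    "GPQA (0-shot)": 4.14,
    "MuSR (0-shot)": 6.14,
    "MMLU-PRO (5-shot)": 21.75,
}

# Unweighted mean over the six benchmarks, rounded like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 12.16
```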