--- language: - en license: mit library_name: transformers base_model: - Qwen/Qwen2.5-32B-Instruct datasets: - Magpie-Align/Magpie-Pro-300K-Filtered model-index: - name: TheBeagle-v2beta-32B-MGS results: - task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics: - type: inst_level_strict_acc and prompt_level_strict_acc value: 45.03 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics: - type: acc_norm value: 58.07 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics: - type: exact_match value: 39.43 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics: - type: acc_norm value: 20.13 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics: - type: acc_norm value: 24.5 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 54.57 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/TheBeagle-v2beta-32B-MGS name: Open LLM Leaderboard --- # TheBeagle-v2beta-32B-MGS This model is an experimental version of our latest innovation: `MGS`. Its up to you to figure out what does it means, but its very explicit. We didn't applied our known `UNA` algorithm to the forward pass, but they are entirely compatible and operates in different parts of the neural network and in different ways, tho they both can be seen as a regularization technique. ## MGS MGS stands for... Many-Geeks-Searching... and thats it. Hint: `1+1 is 2, and 1+1 is not 3` We still believe on 1-Epoch should be enough, so we just did 1 Epoch only. ## Dataset Used here the first decent (corpora & size) dataset on the hub: `Magpie-Align/Magpie-Pro-300K-Filtered` Kudos to the Magpie team to contribute with some decent stuff that I personally think is very good to ablate. It achieves the following results on the evaluation set: - Loss: 0.5378 (1 Epoch), outperforming the baseline model. ## Quants [All versions available](https://huggingface.co/fblgit/TheBeagle-v2beta-MGS-GGUF/tree/main) EXL2 by bartowski: https://huggingface.co/bartowski/TheBeagle-v2beta-32B-MGS-GGUF ## Training [Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) ### Training hyperparameters The following hyperparameters were used during training: - learning_rate: 8e-05 - train_batch_size: 2 - eval_batch_size: 2 - seed: 42 - distributed_type: multi-GPU - num_devices: 8 - gradient_accumulation_steps: 4 - total_train_batch_size: 64 - total_eval_batch_size: 16 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 - lr_scheduler_type: cosine - lr_scheduler_warmup_steps: 25 - num_epochs: 1 ### Training results | Training Loss | Epoch | Step | Validation Loss | |:-------------:|:------:|:----:|:---------------:| | 9.8642 | 0.0012 | 1 | 0.7195 | | 2.077 | 0.0507 | 42 | 0.6161 | | 1.0325 | 0.1014 | 84 | 0.6093 | | 0.8945 | 0.1520 | 126 | 0.5962 | | 0.8532 | 0.2027 | 168 | 0.5869 | | 0.8185 | 0.2534 | 210 | 0.5805 | | 0.81 | 0.3041 | 252 | 0.5719 | | 0.7901 | 0.3548 | 294 | 0.5663 | | 0.7766 | 0.4054 | 336 | 0.5618 | | 0.7687 | 0.4561 | 378 | 0.5590 | | 0.7443 | 0.5068 | 420 | 0.5564 | | 0.7494 | 0.5575 | 462 | 0.5525 | | 0.7787 | 0.6081 | 504 | 0.5485 | | 0.7381 | 0.6588 | 546 | 0.5466 | | 0.7359 | 0.7095 | 588 | 0.5444 | | 0.7447 | 0.7602 | 630 | 0.5435 | | 0.7378 | 0.8109 | 672 | 0.5415 | | 0.7302 | 0.8615 | 714 | 0.5398 | | 0.7476 | 0.9122 | 756 | 0.5391 | | 0.715 | 0.9629 | 798 | 0.5378 | # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__TheBeagle-v2beta-32B-MGS) | Metric |Value| |-------------------|----:| |Avg. |40.29| |IFEval (0-Shot) |45.03| |BBH (3-Shot) |58.07| |MATH Lvl 5 (4-Shot)|39.43| |GPQA (0-shot) |20.13| |MuSR (0-shot) |24.50| |MMLU-PRO (5-shot) |54.57| ## Thanks - Qwen Team for their outstanding model - MagPie Team for contributing plenty of datasets - Cybertron Cloud Compute