---
license: cc-by-nc-4.0
---

# Qwen3-4B (Custom Fine-Tune) Sile

[![HellaSwag acc_norm](https://img.shields.io/badge/HellaSwag_acc_norm-71.1%25-brightgreen)](#benchmark-results)
[![ARC-Challenge acc_norm](https://img.shields.io/badge/ARC--Challenge_acc_norm-65.9%25-brightgreen)](#benchmark-results)
![Params](https://img.shields.io/badge/Params-4B-blue)
![Hardware](https://img.shields.io/badge/Hardware-RTX%203060%2012GB-orange)

---

## Model Summary

- **Author:** rfcoder0
- **Model Type:** Qwen3-4B base, custom fine-tuned (Sile)
- **Hardware Used:** RTX 3060 (12 GB) + RTX 3070 (8 GB)
- **Training:** Proprietary fine-tune on a curated dataset
- **Evaluation:** [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), 5-shot

This fine-tuned Qwen3-4B demonstrates performance comparable to, and in some cases exceeding, 7B–8B parameter models on standard reasoning and commonsense benchmarks.

---

## Benchmark Results (5-shot)

| Task          | acc    | acc_norm |
|---------------|--------|----------|
| HellaSwag     | 0.540  | 0.711    |
| ARC-Challenge | 0.615  | 0.659    |
| MMLU          | *TBD*  | *TBD*    |

*Results produced locally with lm-evaluation-harness, batch_size=1. Values are means; standard errors are omitted from the table.*

---

## Comparison (acc_norm)

| Model             | Params | HellaSwag  | ARC-Challenge |
|-------------------|--------|------------|---------------|
| **This work**     | 4B     | **0.711**  | **0.659**     |
| Qwen3-8B (base)   | 8B     | ~0.732     | ~0.58         |
| LLaMA-2-7B        | 7B     | ~0.70–0.72 | ~0.55–0.57    |
| Mistral-7B        | 7B     | ~0.74–0.75 | ~0.60–0.62    |

---

## Notes

- These results were obtained on **consumer GPUs: an RTX 3060 (12 GB) and an RTX 3070 (8 GB)**.
- The fine-tuning procedure and dataset remain proprietary.
- The scores indicate that, with high-quality data and efficient training, a **4B-parameter model can rival or outperform 7B–8B baselines** on reasoning and commonsense benchmarks.

---

## Usage

Weights are **not provided**. This repository serves as a **benchmark disclosure**. If you wish to reproduce similar results, see [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) for methodology; a minimal evaluation sketch is included at the end of this card.

---

## License

This model is licensed under the **Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)** license.

You are free to:

- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material

Under the following terms:

- **Attribution** — You must give appropriate credit.
- **NonCommercial** — You may not use the material for commercial purposes.

Full license text: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)

---

## Support

If you find this work valuable and want to support further experiments:

- **Bitcoin:** bc1q76vw4krfx24gvz73pwmhav620xe6fxkxdh0s48
- **Other:** Feel free to contact me for additional options.

---

## Citation

If you reference these results, please cite this repository:

```bibtex
@misc{rfcoder02025qwen4b,
  title  = {Qwen3-4B (Sile)},
  author = {Rob Hak},
  year   = {2025},
  url    = {https://huggingface.co/rfcoder0/qwen3-4b-custom-Sile}
}
```
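
---

## Evaluation Sketch

Since neither the weights nor an official script are released, the snippet below is only a minimal sketch of how a comparable 5-shot run can be set up with the lm-evaluation-harness Python API (v0.4+). The model ID `Qwen/Qwen3-4B` is a placeholder for the unreleased fine-tune, and the `dtype` setting and exact argument names may differ across harness versions.

```python
# Minimal sketch of a 5-shot run with lm-evaluation-harness (v0.4+ API).
# Assumption: the public Qwen/Qwen3-4B checkpoint stands in for the
# unreleased fine-tuned weights described in this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                   # Hugging Face transformers backend
    model_args="pretrained=Qwen/Qwen3-4B,dtype=bfloat16",
    tasks=["hellaswag", "arc_challenge"],         # tasks reported in this card
    num_fewshot=5,                                # 5-shot, matching the tables above
    batch_size=1,                                 # matches the local setup noted above
)

# Print acc / acc_norm per task, analogous to the Benchmark Results table.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Running on a 12 GB consumer GPU generally requires half-precision (or quantized) loading and `batch_size=1`, which is why those settings are shown here.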