---
license: cc-by-nc-4.0
---
# Qwen3-4B "Sile" (Custom Fine-Tune)
[![HellaSwag acc_norm](https://img.shields.io/badge/HellaSwag_acc_norm-71.1%25-brightgreen)](#benchmark-results)
[![ARC-Challenge acc_norm](https://img.shields.io/badge/ARC--Challenge_acc_norm-65.9%25-brightgreen)](#benchmark-results)
![Params](https://img.shields.io/badge/Params-4B-blue)
![Hardware](https://img.shields.io/badge/Hardware-RTX%203060%2012GB-orange)
---
## Model Summary
- **Author:** rfcoder0
- **Model Type:** Qwen3-4B base, custom fine-tune ("Sile")
- **Hardware Used:** RTX 3060 (12 GB) + RTX 3070 (8 GB)
- **Training:** Proprietary fine-tune on a curated dataset
- **Evaluation:** [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), 5-shot
This fine-tuned Qwen3-4B demonstrates performance comparable to, and in some cases exceeding, 7B–8B parameter models on standard reasoning and commonsense benchmarks.
---
## Benchmark Results (5-shot)
| Task | acc | acc_norm |
|---------------|--------|----------|
| HellaSwag | 0.540 | 0.711 |
| ARC-Challenge | 0.615 | 0.659 |
| MMLU | *TBD* | *TBD* |
*Results produced locally with lm-evaluation-harness, batch_size=1. `acc` is raw log-likelihood accuracy; `acc_norm` normalizes each candidate's log-likelihood by its byte length before selection.*
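For readers comparing the two columns: under `acc` the chosen answer is the candidate with the highest total log-likelihood, while under `acc_norm` each candidate's log-likelihood is divided by its byte length first, which removes the bias toward shorter answers. A toy sketch of the selection rule (illustrative only, not the harness internals):

```python
# Toy illustration of acc vs. acc_norm selection.
# Each candidate is (total_log_likelihood, byte_length_of_answer_text).
candidates = [(-12.0, 10), (-15.0, 40)]

# acc: pick the candidate with the highest raw log-likelihood.
acc_pick = max(range(len(candidates)), key=lambda i: candidates[i][0])

# acc_norm: pick the candidate with the highest length-normalized log-likelihood.
acc_norm_pick = max(range(len(candidates)),
                    key=lambda i: candidates[i][0] / candidates[i][1])

print(acc_pick)       # 0: highest raw log-likelihood
print(acc_norm_pick)  # 1: highest length-normalized log-likelihood
```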
---
## Comparison (acc_norm)
| Model | Params | HellaSwag | ARC-Challenge |
|-------------------|--------|-----------|---------------|
| **This work** | 4B | **0.711** | **0.659** |
| Qwen3-8B (base) | 8B | ~0.732 | ~0.58 |
| LLaMA-2-7B | 7B | ~0.70–0.72| ~0.55–0.57 |
| Mistral-7B | 7B | ~0.74–0.75| ~0.60–0.62 |
---
## Notes
- These results were obtained on **consumer GPUs: an RTX 3060 (12 GB) and an RTX 3070 (8 GB)**.
- The fine-tune procedure and dataset remain proprietary.
- Scores indicate that with high-quality data and efficient training, a **4B parameter model can rival or outperform 7B–8B baselines** on reasoning and commonsense benchmarks.
---
## Usage
Weights are **not provided**. This repository serves as a **benchmark disclosure**.
If you wish to reproduce similar results, see [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) for methodology.
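As a non-authoritative sketch, a run matching the settings reported above might look like this with the harness's Python API (assumes lm-evaluation-harness ≥ 0.4, installable via `pip install lm-eval`; the checkpoint path is a placeholder, since the weights for this fine-tune are not published):

```python
# Hypothetical reproduction sketch: evaluates a local HF-format checkpoint
# at 5-shot with batch_size=1, matching the settings reported above.
# "path/to/your-model" is a placeholder; this repo does not ship weights.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                  # HuggingFace transformers backend
    model_args="pretrained=path/to/your-model",  # placeholder checkpoint path
    tasks=["hellaswag", "arc_challenge"],
    num_fewshot=5,                               # 5-shot, as in the tables above
    batch_size=1,                                # matches the reported setting
)

# Per-task metrics (acc, acc_norm, and their stderr) live under "results".
for task, metrics in results["results"].items():
    print(task, metrics)
```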
---
## License
This model is licensed under the **Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)** license.
You are free to:
- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material
Under the following terms:
- **Attribution** — You must give appropriate credit.
- **NonCommercial** — You may not use the material for commercial purposes.
Full license text: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
---
## Support
If you find this work valuable and want to support further experiments:
- **Bitcoin:** bc1q76vw4krfx24gvz73pwmhav620xe6fxkxdh0s48
- **Other:** Feel free to contact me for additional options.
---
## Citation
If you reference these results, please cite this repository:
```bibtex
@misc{rfcoder02025qwen4b,
  title  = {Qwen3-4B (Sile)},
  author = {Rob Hak},
  year   = {2025},
  url    = {https://huggingface.co/rfcoder0/qwen3-4b-custom-Sile}
}
```