---
license: cc-by-nc-4.0
---

# Qwen3-4B (Custom Fine-Tune) Sile

[![HellaSwag acc_norm](https://img.shields.io/badge/HellaSwag_acc_norm-71.1%25-brightgreen)](#benchmark-results)
[![ARC-Challenge acc_norm](https://img.shields.io/badge/ARC--Challenge_acc_norm-65.9%25-brightgreen)](#benchmark-results)
![Params](https://img.shields.io/badge/Params-4B-blue)
![Hardware](https://img.shields.io/badge/Hardware-RTX%203060%2012GB-orange)

---

## Model Summary

- **Author:** rfcoder0
- **Model Type:** Qwen3-4B base, custom fine-tuned (Sile)
- **Hardware Used:** RTX 3060 (12 GB) + RTX 3070 (8 GB)
- **Training:** Proprietary fine-tune on a curated dataset
- **Evaluation:** [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness), 5-shot

This fine-tuned Qwen3-4B demonstrates performance comparable to, and in some cases exceeding, 7B–8B parameter models on standard reasoning and commonsense benchmarks.

---

## Benchmark Results (5-shot)

| Task          | acc    | acc_norm |
|---------------|--------|----------|
| HellaSwag     | 0.540  | 0.711    |
| ARC-Challenge | 0.615  | 0.659    |
| MMLU          | *TBD*  | *TBD*    |

*Results produced locally with lm-evaluation-harness, batch_size=1. Values are means; standard errors are omitted from the table.*

---

## Comparison (acc_norm)

| Model             | Params | HellaSwag  | ARC-Challenge |
|-------------------|--------|------------|---------------|
| **This work**     | 4B     | **0.711**  | **0.659**     |
| Qwen3-8B (base)   | 8B     | ~0.732     | ~0.58         |
| LLaMA-2-7B        | 7B     | ~0.70–0.72 | ~0.55–0.57    |
| Mistral-7B        | 7B     | ~0.74–0.75 | ~0.60–0.62    |

---

## Notes

- These results were obtained on **consumer GPUs: an RTX 3060 (12 GB) and an RTX 3070 (8 GB)**.
- The fine-tuning procedure and dataset remain proprietary.
- The scores indicate that, with high-quality data and efficient training, a **4B-parameter model can rival or outperform 7B–8B baselines** on reasoning and commonsense benchmarks.

---

## Usage

Weights are **not provided**. This repository serves as a **benchmark disclosure**. If you wish to reproduce similar results, see [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) for methodology; a minimal evaluation sketch is included at the end of this card.

---

## License

This model is licensed under the **Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)** license.

You are free to:

- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material

Under the following terms:

- **Attribution** — You must give appropriate credit.
- **NonCommercial** — You may not use the material for commercial purposes.

Full license text: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)

---

## Support

If you find this work valuable and want to support further experiments:

- **Bitcoin:** bc1q76vw4krfx24gvz73pwmhav620xe6fxkxdh0s48
- **Other:** Feel free to contact me for additional options.

---

## Citation

If you reference these results, please cite this repository:

```bibtex
@misc{rfcoder02025qwen4b,
  title  = {Qwen3-4B (Sile)},
  author = {Rob Hak},
  year   = {2025},
  url    = {https://huggingface.co/rfcoder0/qwen3-4b-custom-Sile}
}
```
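
---

## Evaluation Sketch

Since neither the weights nor an official script are released, the snippet below is only a minimal sketch of how a comparable 5-shot run can be set up with the lm-evaluation-harness Python API (v0.4+). The model ID `Qwen/Qwen3-4B` is a placeholder for the unreleased fine-tune, and the `dtype` setting and exact argument names may differ across harness versions.

```python
# Minimal sketch of a 5-shot run with lm-evaluation-harness (v0.4+ API).
# Assumption: the public Qwen/Qwen3-4B checkpoint stands in for the
# unreleased fine-tuned weights described in this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                   # Hugging Face transformers backend
    model_args="pretrained=Qwen/Qwen3-4B,dtype=bfloat16",
    tasks=["hellaswag", "arc_challenge"],         # tasks reported in this card
    num_fewshot=5,                                # 5-shot, matching the tables above
    batch_size=1,                                 # matches the local setup noted above
)

# Print acc / acc_norm per task, analogous to the Benchmark Results table.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Running on a 12 GB consumer GPU generally requires half-precision (or quantized) loading and `batch_size=1`, which is why those settings are shown here.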