Qwen3-4B Sile (Custom Fine-Tune)

Model Summary

  • Author: rfcoder0
  • Model Type: Qwen3-4B base, custom fine-tune (Sile)
  • Hardware Used: RTX 3060 (12 GB) + RTX 3070 (8 GB)
  • Training: Proprietary fine-tune on a curated dataset
  • Evaluation: lm-evaluation-harness, 5-shot

This fine-tuned Qwen3-4B demonstrates performance comparable to, and in some cases exceeding, 7B–8B parameter models on standard reasoning and commonsense benchmarks.


Benchmark Results (5-shot)

| Task          | acc   | acc_norm |
|---------------|-------|----------|
| HellaSwag     | 0.540 | 0.711    |
| ARC-Challenge | 0.615 | 0.659    |
| MMLU          | TBD   | TBD      |

Values are means over the evaluation set (standard errors not shown). Results produced locally with lm-evaluation-harness, batch_size=1.
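For readers unfamiliar with the two columns: acc picks the answer choice with the highest total log-likelihood, while acc_norm first divides each choice's log-likelihood by its length, removing the bias toward shorter continuations. A minimal sketch of the idea (illustrative only, not the harness's exact internals):

```python
def pick_answer(choices, logliks, normalize=False):
    """Return the index of the chosen answer given per-choice log-likelihoods.

    With normalize=False this mirrors plain accuracy (acc): highest total
    log-likelihood wins. With normalize=True it mirrors acc_norm: each
    log-likelihood is divided by the answer's byte length first.
    """
    scores = []
    for text, ll in zip(choices, logliks):
        length = len(text.encode("utf-8"))
        scores.append(ll / length if normalize else ll)
    return max(range(len(scores)), key=lambda i: scores[i])

choices = ["yes", "absolutely, without question"]
logliks = [-4.0, -9.0]  # longer answers naturally accumulate lower total log-likelihood

pick_answer(choices, logliks)              # acc-style: picks index 0 ("yes")
pick_answer(choices, logliks, True)        # acc_norm-style: picks index 1
```

Length normalization is why acc_norm is usually the headline number on multiple-choice benchmarks like HellaSwag, whose correct endings are often longer than the distractors.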


Comparison (acc_norm)

| Model            | Params | HellaSwag  | ARC-Challenge |
|------------------|--------|------------|---------------|
| This work        | 4B     | 0.711      | 0.659         |
| Qwen3-8B (base)  | 8B     | ~0.732     | ~0.58         |
| LLaMA-2-7B       | 7B     | ~0.70–0.72 | ~0.55–0.57    |
| Mistral-7B       | 7B     | ~0.74–0.75 | ~0.60–0.62    |

Notes

  • These results were obtained on consumer GPUs: an RTX 3060 (12 GB) and an RTX 3070 (8 GB).
  • The fine-tune procedure and dataset remain proprietary.
  • Scores indicate that with high-quality data and efficient training, a 4B parameter model can rival or outperform 7B–8B baselines on reasoning and commonsense benchmarks.

Usage

Weights are not provided. This repository serves as a benchmark disclosure.
If you wish to reproduce similar results, see lm-evaluation-harness for methodology.
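As a starting point, a comparable 5-shot evaluation can be run with the harness's CLI. The command below is a sketch: the base Qwen3-4B checkpoint stands in for the unreleased fine-tune, and flag names may vary across harness versions.

```shell
pip install lm-eval

# 5-shot evaluation of a Hub-hosted checkpoint, matching the settings above
lm_eval --model hf \
  --model_args pretrained=Qwen/Qwen3-4B,dtype=float16 \
  --tasks hellaswag,arc_challenge \
  --num_fewshot 5 \
  --batch_size 1
```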


License

This model is licensed under the Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0).

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material

Under the following terms:

  • Attribution — You must give appropriate credit.
  • NonCommercial — You may not use the material for commercial purposes.

Full license text: CC BY-NC 4.0.

Support

If you find this work valuable and want to support further experiments:

  • Bitcoin: bc1q76vw4krfx24gvz73pwmhav620xe6fxkxdh0s48
  • Other: Feel free to contact me for additional options.

Citation

If you reference these results, please cite this repository:

@misc{rfcoder02025qwen4b,
  title  = {Qwen3-4B (Sile)},
  author = {Rob Hak},
  year   = {2025},
  url    = {https://huggingface.co/rfcoder0/qwen3-4b-custom-Sile}
}