---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
language:
- en
base_model:
- GetSoloTech/Qwen3-Code-Reasoning-4B
pipeline_tag: text-generation
---

# GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF

This is the GGUF quantized version of [Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B), optimized for competitive programming and code reasoning tasks. The underlying model was finetuned on the high-quality Code-Reasoning dataset to improve its ability to solve complex programming problems with detailed, step-by-step reasoning.

## πŸš€ Key Features

* **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
* **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
* **High-Quality Solutions**: Trained on solutions with β‰₯85% test case pass rates
* **Structured Output**: Optimized for generating well-reasoned programming solutions
* **Efficient Inference**: GGUF format enables fast inference on CPU and GPU
* **Multiple Quantization Levels**: Available in various precision levels for different hardware requirements

### Dataset Statistics

* **Split**: Python
* **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
* **Quality Filter**: Only correctly solved problems with β‰₯85% test case pass rates

## πŸ”§ Usage

### Using with llama.cpp

```bash
# Download the model (choose your preferred quantization)
wget https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF/resolve/main/qwen3-code-reasoning-4b.Q4_K_M.gguf

# Run inference (current llama.cpp builds ship the CLI binary as llama-cli)
./llama-cli -m qwen3-code-reasoning-4b.Q4_K_M.gguf -n 4096 --repeat-penalty 1.1 -p "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.\n\nProblem: Your programming problem here..."
```
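
llama.cpp also ships an OpenAI-compatible HTTP server, which is convenient for interactive use. A minimal sketch, assuming a recent build; the port and context size here are arbitrary choices:

```bash
# Serve the model over an OpenAI-compatible API
./llama-server -m qwen3-code-reasoning-4b.Q4_K_M.gguf -c 4096 --port 8080

# Query it from another shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Problem: Your programming problem here..."}]}'
```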

### Using with Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=4,
)

# Prepare input for a competitive programming problem
prompt = """You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.

Problem: Your programming problem here..."""

# Generate a solution
output = llm(
    prompt,
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)

print(output["choices"][0]["text"])
```
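
Because the GGUF file carries the model's chat template, you can also let llama-cpp-python apply it for you via `create_chat_completion` instead of hand-building the prompt. A sketch using the same sampling values:

```python
from llama_cpp import Llama

llm = Llama(model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf", n_ctx=4096)

# create_chat_completion applies the chat template embedded in the GGUF file
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an expert competitive programmer."},
        {"role": "user", "content": "Problem: Your programming problem here..."},
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
)

print(response["choices"][0]["message"]["content"])
```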

### Using with Ollama

```bash
# Create a Modelfile (quote the delimiter so the shell leaves the template untouched)
cat > Modelfile << 'EOF'
FROM ./qwen3-code-reasoning-4b.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
EOF

# Create and run the model
ollama create qwen3-code-reasoning -f Modelfile
ollama run qwen3-code-reasoning "Solve this competitive programming problem: [your problem here]"
```
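
Once created, the model can also be queried programmatically through Ollama's local REST API, which listens on port 11434 by default. A minimal sketch:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3-code-reasoning",
  "prompt": "Solve this competitive programming problem: [your problem here]",
  "stream": false
}'
```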

## πŸ“Š Available Quantizations

| Quantization | Size | Memory Usage | Quality | Use Case |
|--------------|------|--------------|---------|----------|
| Q3_K_M | 2.08 GB | ~3 GB | Good | CPU inference, limited memory |
| Q4_K_M | 2.5 GB | ~4 GB | Better | Balanced performance/memory |
| Q5_K_M | 2.89 GB | ~5 GB | Very Good | High quality, moderate memory |
| Q6_K | 3.31 GB | ~6 GB | Excellent | High quality, more memory |
| Q8_0 | 4.28 GB | ~8 GB | Best | Near-lossless quality, high memory |
| F16 | 8.05 GB | ~16 GB | Original | Unquantized precision, GPU recommended |
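
Any of these files can be fetched individually with the Hugging Face CLI instead of `wget`; a sketch, where the filename is assumed to match the repository's file listing:

```bash
# Download a single quantization from the Hub
huggingface-cli download GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF \
  qwen3-code-reasoning-4b.Q5_K_M.gguf --local-dir .
```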

## πŸ“ˆ Performance Expectations

This GGUF model largely preserves the behavior of the original finetuned model (expect small quality losses at the lowest bit widths):

* **Competitive Programming Problems**: Better understanding of problem constraints and requirements
* **Code Generation**: More accurate and efficient solutions
* **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
* **Solution Completeness**: More comprehensive solutions with proper edge case handling

## πŸŽ›οΈ Recommended Settings

### For Code Generation

* **Temperature**: 0.7
* **Top-p**: 0.8
* **Top-k**: 20
* **Max New Tokens**: 4096 (adjust based on problem complexity)
* **Repeat Penalty**: 1.1

### For Reasoning Tasks

* **Temperature**: 0.6
* **Top-p**: 0.95
* **Top-k**: 20
* **Max New Tokens**: 8192 (for complex reasoning)
* **Repeat Penalty**: 1.1
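
As code, the two presets above map directly onto llama-cpp-python sampling arguments; a sketch, where the preset names are purely illustrative:

```python
from llama_cpp import Llama

# Hypothetical preset names; the values mirror the two lists above
CODE_GENERATION = dict(temperature=0.7, top_p=0.8, top_k=20, max_tokens=4096, repeat_penalty=1.1)
REASONING = dict(temperature=0.6, top_p=0.95, top_k=20, max_tokens=8192, repeat_penalty=1.1)

# n_ctx must cover the prompt plus max_tokens, so size it for the larger preset
llm = Llama(model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf", n_ctx=10240)

output = llm("Problem: Your programming problem here...", **CODE_GENERATION)
print(output["choices"][0]["text"])
```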

## πŸ› οΈ Hardware Requirements

### Minimum Requirements

* **RAM**: 4 GB (for the Q3_K_M quantization)
* **Storage**: 2.5 GB free space
* **CPU**: Multi-core processor recommended

### Recommended Requirements

* **RAM**: 8 GB or more
* **Storage**: 5 GB free space
* **GPU**: NVIDIA GPU with 4 GB+ VRAM (optional, for faster inference)

## 🀝 Contributing

This GGUF model was converted from the original LoRA-finetuned model. For questions about:

* The original model: [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B)
* The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
* The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
* The training framework: [Unsloth Documentation](https://github.com/unslothai/unsloth)

## πŸ“„ License

This model follows the same license as the base model (Apache 2.0). Please refer to the base model license for details.

## πŸ™ Acknowledgments

* **Qwen Team** for the excellent base model
* **Unsloth Team** for the efficient training framework
* **NVIDIA Research** for the original OpenCodeReasoning-2 dataset
* **llama.cpp community** for the GGUF format and tools

## πŸ“ž Contact

For questions about this GGUF model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks. The GGUF format enables efficient inference on various hardware configurations while maintaining the model's reasoning capabilities.