CISCai committed
Commit e2b5e83 · verified · 1 Parent(s): 1a580d4

Upload 9 files

.gitattributes CHANGED
@@ -34,3 +34,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  pydevmini1-bf16.gguf filter=lfs diff=lfs merge=lfs -text
37
+ pydevmini1.imatrix.gguf filter=lfs diff=lfs merge=lfs -text
38
+ pydevmini1.IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text
39
+ pydevmini1.IQ2_S.gguf filter=lfs diff=lfs merge=lfs -text
40
+ pydevmini1.IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
41
+ pydevmini1.IQ3_S.gguf filter=lfs diff=lfs merge=lfs -text
42
+ pydevmini1.IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
43
+ pydevmini1.IQ3_XXS.gguf filter=lfs diff=lfs merge=lfs -text
44
+ pydevmini1.IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,355 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model:
6
+ - bralynn/pydevmini1
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - code
10
+ - codeqwen
11
+ - chat
12
+ - qwen
13
+ - qwen-coder
14
+ model_creator: bralynn
15
+ model_name: pydevmini1
16
+ model_type: qwen3
17
+ datasets:
18
+ - m-a-p/CodeFeedback-Filtered-Instruction
19
+ quantized_by: CISC
20
+ ---
21
+
22
+ # pydevmini1 - SOTA GGUF
23
+ - Model creator: [bralynn](https://huggingface.co/bralynn)
24
+ - Original model: [pydevmini1](https://huggingface.co/bralynn/pydevmini1)
25
+
26
+ <!-- description start -->
27
+ ## Description
28
+
29
+ This repo contains State Of The Art quantized GGUF format model files for [pydevmini1](https://huggingface.co/bralynn/pydevmini1).
30
+
31
+ Quantization was done with an importance matrix that was trained for ~1M tokens (256 batches of 4096 tokens) of python-specific answers from the [CodeFeedback-Filtered-Instruction](https://huggingface.co/datasets/m-a-p/CodeFeedback-Filtered-Instruction) dataset.
32
+
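+ For reference, an importance matrix like this is typically produced with llama.cpp's `llama-imatrix` tool over a calibration text file. The following is only a rough sketch: the calibration file name is hypothetical and the exact flags may differ between llama.cpp versions.
+
+ ```shell
+ # ~1M tokens: 256 chunks of 4096 tokens from a Python-focused calibration file (hypothetical name)
+ ./llama-imatrix -m pydevmini1-bf16.gguf -f codefeedback-python.txt -c 4096 --chunks 256 -o pydevmini1.imatrix.gguf
+ ```
+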
33
+ Fill-in-Middle tokens are automatically detected and supported as of commit [11ac980](https://github.com/ggml-org/llama.cpp/commit/11ac9800aff532715a5bc7991062c68ba3472e6e), see [example](#simple-llama-cpp-python-example-fill-in-middle-code).
34
+
35
+ <!-- description end -->
36
+
37
+
38
+ <!-- prompt-template start -->
39
+ ## Prompt template: ChatML
40
+
41
+ ```
42
+ <|im_start|>system
43
+ {system_prompt}<|im_end|>
44
+ <|im_start|>user
45
+ {prompt}<|im_end|>
46
+ <|im_start|>assistant
47
+ ```
48
+
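+ For example, with the placeholders filled in (the system and user text are only illustrative), a full prompt looks like this:
+
+ ```
+ <|im_start|>system
+ You are a helpful Python coding assistant.<|im_end|>
+ <|im_start|>user
+ Write a function that reverses a string.<|im_end|>
+ <|im_start|>assistant
+ ```
+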
49
+ <!-- prompt-template end -->
50
+
51
+
52
+ <!-- compatibility_gguf start -->
53
+ ## Compatibility
54
+
55
+ These quantised GGUFv3 files are compatible with llama.cpp from April 9th 2025 onwards, as of commit [d3bd719](https://github.com/ggml-org/llama.cpp/commit/d3bd7193ba66c15963fd1c59448f22019a8caf6e).
56
+
57
+ They are also compatible with many third party UIs and libraries provided they are built using a recent llama.cpp.
58
+
59
+ ## Explanation of quantisation methods
60
+
61
+ <details>
62
+ <summary>Click to see details</summary>
63
+
64
+ The new methods available are:
65
+
66
+ * GGML_TYPE_IQ1_S - 1-bit quantization in super-blocks with an importance matrix applied, effectively using 1.56 bits per weight (bpw)
67
+ * GGML_TYPE_IQ1_M - 1-bit quantization in super-blocks with an importance matrix applied, effectively using 1.75 bpw
68
+ * GGML_TYPE_IQ2_XXS - 2-bit quantization in super-blocks with an importance matrix applied, effectively using 2.06 bpw
69
+ * GGML_TYPE_IQ2_XS - 2-bit quantization in super-blocks with an importance matrix applied, effectively using 2.31 bpw
70
+ * GGML_TYPE_IQ2_S - 2-bit quantization in super-blocks with an importance matrix applied, effectively using 2.5 bpw
71
+ * GGML_TYPE_IQ2_M - 2-bit quantization in super-blocks with an importance matrix applied, effectively using 2.7 bpw
72
+ * GGML_TYPE_IQ3_XXS - 3-bit quantization in super-blocks with an importance matrix applied, effectively using 3.06 bpw
73
+ * GGML_TYPE_IQ3_XS - 3-bit quantization in super-blocks with an importance matrix applied, effectively using 3.3 bpw
74
+ * GGML_TYPE_IQ3_S - 3-bit quantization in super-blocks with an importance matrix applied, effectively using 3.44 bpw
75
+ * GGML_TYPE_IQ3_M - 3-bit quantization in super-blocks with an importance matrix applied, effectively using 3.66 bpw
76
+ * GGML_TYPE_IQ4_XS - 4-bit quantization in super-blocks with an importance matrix applied, effectively using 4.25 bpw
77
+ * GGML_TYPE_IQ4_NL - 4-bit non-linearly mapped quantization with an importance matrix applied, effectively using 4.5 bpw
78
+
79
+ Refer to the Provided Files table below to see what files use which methods, and how.
80
+ </details>
81
+ <!-- compatibility_gguf end -->
82
+
83
+ <!-- README_GGUF.md-provided-files start -->
84
+ ## Provided files
85
+
86
+ | Name | Quant method | Bits | Size | Max RAM required | Use case |
87
+ | ---- | ---- | ---- | ---- | ---- | ----- |
88
+ | [pydevmini1.IQ2_S.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ2_S.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ2_S.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ2_S | 2 | 1.5 GB| 2.0 GB | small, substantial quality loss |
89
+ | [pydevmini1.IQ2_M.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ2_M.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ2_M.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ2_M | 2 | 1.6 GB| 2.1 GB | small, greater quality loss |
90
+ | [pydevmini1.IQ3_XXS.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ3_XXS.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ3_XXS.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ3_XXS | 3 | 1.7 GB| 2.2 GB | very small, high quality loss |
91
+ | [pydevmini1.IQ3_XS.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ3_XS.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ3_XS.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ3_XS | 3 | 1.8 GB| 2.3 GB | small, substantial quality loss |
92
+ | [pydevmini1.IQ3_S.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ3_S.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ3_S.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ3_S | 3 | 1.9 GB| 2.4 GB | small, greater quality loss |
93
+ | [pydevmini1.IQ3_M.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ3_M.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ3_M.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ3_M | 3 | 2.0 GB| 2.5 GB | medium, balanced quality |
94
+ | [pydevmini1.IQ4_XS.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.IQ4_XS.gguf) ([with YaRN](https://ciscai-gguf-editor.hf.space/download/CISCai/pydevmini1-SOTA-GGUF/pydevmini1.IQ4_XS.gguf?branch=main&add=%5B%22qwen3.context_length%22,4,1048576%5D&add=%5B%22qwen3.rope.scaling.type%22,8,%22yarn%22%5D&add=%5B%22qwen3.rope.scaling.factor%22,6,4%5D&add=%5B%22qwen3.rope.scaling.original_context_length%22,4,262144%5D)) | IQ4_XS | 4 | 2.3 GB| 2.8 GB | small, marginal quality loss - recommended |
95
+
96
+ Generated importance matrix file: [pydevmini1.imatrix.gguf](https://huggingface.co/CISCai/pydevmini1-SOTA-GGUF/blob/main/pydevmini1.imatrix.gguf)
97
+
98
+ **Note**: the above RAM figures assume no GPU offloading with 4K context. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
99
+
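+ The "with YaRN" links above only patch the GGUF metadata to advertise a 1M-token context via YaRN rope scaling (factor 4 over the native 262,144 tokens). As a sketch, a similar effect can usually be achieved at load time with the plain files using llama.cpp's rope-scaling options (check the flags against your build):
+
+ ```shell
+ ./llama-cli -m pydevmini1.IQ4_XS.gguf -c 1048576 --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 262144 --jinja
+ ```
+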
100
+ <!-- README_GGUF.md-provided-files end -->
101
+
102
+ <!-- README_GGUF.md-how-to-run start -->
103
+ ## Example `llama.cpp` command
104
+
105
+ Make sure you are using `llama.cpp` from commit [d3bd719](https://github.com/ggml-org/llama.cpp/commit/d3bd7193ba66c15963fd1c59448f22019a8caf6e) or later.
106
+
107
+ ```shell
108
+ ./llama-cli -ngl 37 -m pydevmini1.IQ4_XS.gguf --color -c 262144 --temp 0.7 --top-p 0.8 --top-k 20 --repeat-penalty 1.05 --jinja
109
+ ```
110
+
111
+ Change `-ngl 37` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
112
+
113
+ Change `-c 262144` to the desired sequence length.
114
+
115
+ If you are low on VRAM/RAM, try quantizing the K-cache with `-ctk q8_0` or even `-ctk q4_0` for big memory savings (depending on context size).
116
+ A similar option exists for the V-cache (`-ctv`), but it is only available if you also enable Flash Attention (`-fa`); see the sketch below.
117
+
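+ A minimal sketch combining these cache options with the example command above (values are illustrative; verify the flags against your llama.cpp version):
+
+ ```shell
+ ./llama-cli -ngl 37 -m pydevmini1.IQ4_XS.gguf --color -c 262144 -fa -ctk q8_0 -ctv q8_0 --temp 0.7 --top-p 0.8 --top-k 20 --repeat-penalty 1.05 --jinja
+ ```
+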
118
+ For other parameters and how to use them, please refer to [the llama.cpp documentation](https://github.com/ggml-org/llama.cpp/blob/master/tools/main/README.md)
119
+
120
+ ## How to run from Python code
121
+
122
+ You can use GGUF models from Python using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) module.
123
+
124
+ ### How to load this model in Python code, using llama-cpp-python
125
+
126
+ For full documentation, please see: [llama-cpp-python docs](https://llama-cpp-python.readthedocs.io/en/latest/).
127
+
128
+ #### First install the package
129
+
130
+ Run one of the following commands, according to your system:
131
+
132
+ ```shell
133
+ # Prebuilt wheel with basic CPU support
134
+ pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
135
+ # Prebuilt wheel with NVidia CUDA acceleration
136
+ pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121  # or cu122 etc.
137
+ # Prebuilt wheel with Metal GPU acceleration
138
+ pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal
139
+ # Build base version with no GPU acceleration
140
+ pip install llama-cpp-python
141
+ # With NVidia CUDA acceleration
142
+ CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
143
+ # Or with OpenBLAS acceleration
144
+ CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
145
+ # Or with AMD ROCm GPU acceleration (Linux only)
146
+ CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install llama-cpp-python
147
+ # Or with Metal GPU acceleration for macOS systems only
148
+ CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python
149
+ # Or with Vulkan acceleration
150
+ CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
151
+ # Or with SYCL acceleration
152
+ CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python
153
+
154
+ # On Windows, set the CMAKE_ARGS variable in PowerShell like this, e.g. for NVidia CUDA:
155
+ $env:CMAKE_ARGS = "-DGGML_CUDA=on"
156
+ pip install llama-cpp-python
157
+ ```
158
+
159
+ #### Simple llama-cpp-python example code
160
+
161
+ ```python
162
+ from llama_cpp import Llama
163
+
164
+ # Chat Completion API
165
+
166
+ llm = Llama(model_path="./pydevmini1.IQ4_XS.gguf", n_gpu_layers=37, n_ctx=262144)
167
+ print(llm.create_chat_completion(
168
+ repeat_penalty = 1.05,
169
+ messages = [
170
+ {
171
+ "role": "user",
172
+ "content": "Pick a LeetCode challenge and solve it in Python."
173
+ }
174
+ ]
175
+ ))
176
+ ```
177
+
178
+ #### Simple llama-cpp-python example fill-in-middle code
179
+
180
+ ```python
181
+ from llama_cpp import Llama
182
+
183
+ # Completion API
184
+
185
+ prompt = "def add("
186
+ suffix = "\n return sum\n\n"
187
+
188
+ llm = Llama(model_path="./pydevmini1.IQ4_XS.gguf", n_gpu_layers=37, n_ctx=262144)
189
+ output = llm.create_completion(
190
+ temperature = 0.0,
191
+ repeat_penalty = 1.0,
192
+ prompt = prompt,
193
+ suffix = suffix
194
+ )
195
+
196
+ # Models sometimes repeat suffix in response, attempt to filter that
197
+ response = output["choices"][0]["text"]
198
+ response_stripped = response.rstrip()
199
+ unwanted_response_suffix = suffix.rstrip()
200
+ unwanted_response_length = len(unwanted_response_suffix)
201
+
202
+ filtered = False
203
+ if unwanted_response_suffix and response_stripped[-unwanted_response_length:] == unwanted_response_suffix:
204
+ response = response_stripped[:-unwanted_response_length]
205
+ filtered = True
206
+
207
+ print(f"Fill-in-Middle completion{' (filtered)' if filtered else ''}:\n\n{prompt}\033[32m{response}\033[{'33' if filtered else '0'}m{suffix}\033[0m")
208
+ ```
209
+
210
+ #### Simple llama-cpp-python example function calling code
211
+
212
+ ```python
213
+ from llama_cpp import Llama
214
+
215
+ # Chat Completion API
216
+
217
+ grammar = LlamaGrammar.from_json_schema(json.dumps({
218
+ "type": "array",
219
+ "items": {
220
+ "type": "object",
221
+ "required": [ "name", "arguments" ],
222
+ "properties": {
223
+ "name": {
224
+ "type": "string"
225
+ },
226
+ "arguments": {
227
+ "type": "object"
228
+ }
229
+ }
230
+ }
231
+ }))
232
+
233
+ llm = Llama(model_path="./pydevmini1.IQ4_XS.gguf", n_gpu_layers=37, n_ctx=262144)
234
+ response = llm.create_chat_completion(
235
+ temperature = 0.0,
236
+ repeat_penalty = 1.05,
237
+ messages = [
238
+ {
239
+ "role": "user",
240
+ "content": "What's the weather like in Oslo and Stockholm?"
241
+ }
242
+ ],
243
+ tools=[{
244
+ "type": "function",
245
+ "function": {
246
+ "name": "get_current_weather",
247
+ "description": "Get the current weather in a given location",
248
+ "parameters": {
249
+ "type": "object",
250
+ "properties": {
251
+ "location": {
252
+ "type": "string",
253
+ "description": "The city and state, e.g. San Francisco, CA"
254
+ },
255
+ "unit": {
256
+ "type": "string",
257
+ "enum": [ "celsius", "fahrenheit" ]
258
+ }
259
+ },
260
+ "required": [ "location" ]
261
+ }
262
+ }
263
+ }],
264
+ grammar = grammar
265
+ )
266
+ print(json.loads(response["choices"][0]["message"]["content"]))
267
+
268
+ print(llm.create_chat_completion(
269
+ temperature = 0.0,
270
+ repeat_penalty = 1.05,
271
+ messages = [
272
+ {
273
+ "role": "user",
274
+ "content": "What's the weather like in Oslo?"
275
+ },
276
+ { # The tool_calls is from the response to the above with tool_choice active
277
+ "role": "assistant",
278
+ "content": None,
279
+ "tool_calls": [
280
+ {
281
+ "id": "call__0_get_current_weather_cmpl-...",
282
+ "type": "function",
283
+ "function": {
284
+ "name": "get_current_weather",
285
+ "arguments": { "location": "Oslo, Norway" , "unit": "celsius" }
286
+ }
287
+ }
288
+ ]
289
+ },
290
+ { # The tool_call_id is from tool_calls and content is the result from the function call you made
291
+ "role": "tool",
292
+ "content": "20",
293
+ "tool_call_id": "call__0_get_current_weather_cmpl-..."
294
+ }
295
+ ],
296
+ tools=[{
297
+ "type": "function",
298
+ "function": {
299
+ "name": "get_current_weather",
300
+ "description": "Get the current weather in a given location",
301
+ "parameters": {
302
+ "type": "object",
303
+ "properties": {
304
+ "location": {
305
+ "type": "string",
306
+ "description": "The city and state, e.g. San Francisco, CA"
307
+ },
308
+ "unit": {
309
+ "type": "string",
310
+ "enum": [ "celsius", "fahrenheit" ]
311
+ }
312
+ },
313
+ "required": [ "location" ]
314
+ }
315
+ }
316
+ }],
317
+ #tool_choice={
318
+ # "type": "function",
319
+ # "function": {
320
+ # "name": "get_current_weather"
321
+ # }
322
+ #}
323
+ ))
324
+ ```
325
+
326
+ <!-- README_GGUF.md-how-to-run end -->
327
+
328
+ <!-- original-model-card start -->
329
+ ## 🚀 Try It Yourself (for free)
330
+
331
+ Don't just take my word for it. Test the model right now under the exact conditions shown in the video demonstration.
332
+
333
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1c8WCvsVovCjIyqPcwORX4c_wQ7NyIrTP?usp=sharing)
334
+
335
+ ---
336
+
337
+ ## Model Details
338
+ * **Model Type:** Causal Language Model
339
+ * **Number of Parameters:** 4.0B
340
+ * **Number of Parameters (Non-Embedding):** 3.6B
341
+ * **Number of Layers:** 36
342
+ * **Number of Attention Heads (GQA):** 32 for Q, 8 for KV
343
+ * **Context Length:** 262,144 tokens (native)
344
+
345
+ ### Recommended Inference Parameters
346
+
347
+ For best results, I suggest using the following generation parameters (a usage sketch follows the list):
348
+ * **Temperature:** 0.7
349
+ * **Top P:** 0.8
350
+ * **Top K:** 20
351
+ * **Min P:** 0.0
352
+
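+ As a minimal llama-cpp-python sketch, these parameters map onto the chat completion call roughly as follows (model path and prompt are placeholders):
+
+ ```python
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="./pydevmini1.IQ4_XS.gguf", n_gpu_layers=37, n_ctx=262144)
+ print(llm.create_chat_completion(
+     temperature = 0.7,
+     top_p = 0.8,
+     top_k = 20,
+     min_p = 0.0,
+     messages = [
+         { "role": "user", "content": "Write a Python function that checks whether a string is a palindrome." }
+     ]
+ ))
+ ```
+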
353
+ ### How to Contribute & Provide Feedback
354
+ For any and all feedback, please open a discussion in the Community tab of this model repository, or join our Discord!
355
+ Discord: https://discord.gg/RqwqMGhqaC
pydevmini1.IQ2_M.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:65d9e96630ca0f4036a5a35f7dc6cbda1bcbf1b8b6fc6923913f1dde950ace90
3
+ size 1658843296
pydevmini1.IQ2_S.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1f919bea6018c67d04942bfbc94ab2321f0545f2f758b6d9b1b0d0de4c176644
3
+ size 1563160736
pydevmini1.IQ3_M.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:19ecfd81f370fbe66de4128c35157a5f06bd1e16d3ebca0967921fe959274344
3
+ size 2057097376
pydevmini1.IQ3_S.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4f91d0c5d94c88fcaf7677ffc3881b7f990734893578539f273a645815ab5274
3
+ size 1993732256
pydevmini1.IQ3_XS.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:96b758905b3fe6d2f78a66ce739630a66b6ed2ed93be23b1ab4ea37a5ff0a647
3
+ size 1908576416
pydevmini1.IQ3_XXS.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:08c3486bbdddee7be01542b63421863bdc5840cf35f62b1eb07c710941c00c1f
3
+ size 1816047776
pydevmini1.IQ4_XS.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ce01051f836b15ec93d86c77a83c5da8352c35b03bc27431e89ca4798699884a
3
+ size 2364952736
pydevmini1.imatrix.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4c4558f3b4a628bad256fe27f53245307403940e6b220f5335a6b37c9f723c38
3
+ size 3872640