nwdxlgzs/XL-LuaCopilot-1.7B-FFT-GGUF

+---
+tags:
+- unsloth
+- lua
+base_model:
+- nwdxlgzs/XL-LuaCopilot-1.7B-FFT-checkpoint-46000
+license: gpl-3.0
+library_name: transformers
+pipeline_tag: text-generation
+---
+# XL-LuaCopilot-1.7B-FFT
+XL-LuaCopilot-1.7B-FFT is a large language model (LLM) based on the Qwen architecture (Qwen3-1.7B), designed specifically for code generation in the Lua programming language. It has been fully fine-tuned (FFT) to improve its performance and efficiency when generating Lua code.
+I suggest setting `"chat_template_kwargs": {"enable_thinking": false}`, because the training data contains no thinking traces. I have also found that a low `temperature` usually works well for code generation tasks.
+quantize=["Q4_0", "Q4_1", "Q5_0", "Q5_1", "IQ3_XXS", "IQ3_S", "IQ3_M", "Q3_K", "IQ3_XS", "Q3_K_S", "Q3_K_M", "Q3_K_L", "IQ4_NL", "IQ4_XS", "Q4_K", "Q4_K_S", "Q4_K_M", "Q5_K", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "F16", "BF16"]
+> `checkpoint-37000` is the checkpoint where the model had just entered the plateau phase with a lower loss, and it might be better than `checkpoint-46000`. However, the GGUF files I provide will still be released based on the final checkpoint at the end of training.
+## Train Samples
+samples seen: 1472000 = (steps=46000) x (per_device_train_batch_size=8) x (gradient_accumulation_steps=4) x (device=1)
+dataset size: 1464339 = (luafiles=488113) x (split=3)
+epochs = 1472000 / 1464339 ≈ 1.005
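The figures above can be re-derived with a few lines of arithmetic. This is only a sanity-check sketch: the numbers are copied from this card, while the variable names are illustrative and not taken from the training script.

```python
# Numbers copied from the card; variable names are illustrative only.
steps = 46000
per_device_train_batch_size = 8
gradient_accumulation_steps = 4
devices = 1

lua_files = 488113
splits_per_file = 3

# Samples consumed during training.
samples_seen = (
    steps * per_device_train_batch_size * gradient_accumulation_steps * devices
)
# Samples available in the dataset.
dataset_size = lua_files * splits_per_file
# Fraction of the dataset that was traversed.
epochs = samples_seen / dataset_size

print(samples_seen, dataset_size, round(epochs, 3))
```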
+## How To Use
+> With an OpenAI-compatible API (llama.cpp: llama-server)
+```json
+-> REQUEST ->
+{
+    "model": "XL-LuaCopilot-1.7B-FFT",
+    "messages": [
+        {"role": "system","content": "prefix"},
+        {"role": "user","content": "do\n--打印：你好世界\n  local tex"},
+        {"role": "system","content": "suffix"},
+        {"role": "user","content": "nd"},
+        {"role": "system","content": "middle"}
+    ],
+    "stream": false,
+    "cache_prompt": false,
+    "samplers": "edkypmxt",
+    "temperature": 0.2,
+    "dynatemp_range": 0.1,
+    "dynatemp_exponent": 1,
+    "top_k": 20,
+    "top_p": 0.9,
+    "min_p": 0.05,
+    "typical_p": 1,
+    "xtc_probability": 0,
+    "xtc_threshold": 0.1,
+    "repeat_last_n": 32,
+    "repeat_penalty": 1.1,
+    "presence_penalty": 0,
+    "frequency_penalty": 0.5,
+    "dry_multiplier": 0,
+    "dry_base": 1.75,
+    "dry_allowed_length": 2,
+    "dry_penalty_last_n": -1,
+    "max_tokens": -1,
+    "timings_per_token": true,
+    "chat_template_kwargs": {"enable_thinking": false}
+}
+-> RESPONSE ->
+{
+    "choices": [
+        {
+            "finish_reason": "stop",
+            "index": 0,
+            "message": {
+                "role": "assistant",
+                "content": "<think>\n\n</think>\n\nt = \"你好世界\"\n  print(text)\ne"
+            }
+        }
+    ],
+    ...
+}
+```
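The request above could be assembled and sent from Python roughly as follows. This is a minimal sketch, not the author's client: the helper names, the `base_url` default, and the reduced set of sampling fields are assumptions, and the empty `<think>` block is stripped only because the sample response above starts with one.

```python
import json
import urllib.request

# Empty reasoning block that the sample response above begins with.
EMPTY_THINK = "<think>\n\n</think>\n\n"


def build_fim_messages(prefix_code, suffix_code):
    """Arrange code before/after the cursor in the card's prefix/suffix/middle layout."""
    return [
        {"role": "system", "content": "prefix"},
        {"role": "user", "content": prefix_code},
        {"role": "system", "content": "suffix"},
        {"role": "user", "content": suffix_code},
        {"role": "system", "content": "middle"},
    ]


def build_payload(prefix_code, suffix_code, temperature=0.2):
    """Build a reduced request body; other sampler fields from the card can be added."""
    return {
        "model": "XL-LuaCopilot-1.7B-FFT",
        "messages": build_fim_messages(prefix_code, suffix_code),
        "stream": False,
        "temperature": temperature,
        "chat_template_kwargs": {"enable_thinking": False},
    }


def strip_empty_think(content):
    """Remove the leading empty <think> block, if present."""
    if content.startswith(EMPTY_THINK):
        return content[len(EMPTY_THINK):]
    return content


def complete_middle(prefix_code, suffix_code, base_url="http://127.0.0.1:8080"):
    """POST to llama-server; base_url is an assumption, adjust it to your setup."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(prefix_code, suffix_code)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return strip_empty_think(body["choices"][0]["message"]["content"])
```

With a server running, `complete_middle("do\n  local tex", "nd")` would return only the text that belongs between the two fragments.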
+> I know Qwen has `<|fim_prefix|>` / `<|fim_suffix|>` / `<|fim_middle|>` tokens, but I'm not sure whether Qwen3 was trained on them (I only know that Qwen2.5-Coder was). To keep code generation simple, I use the ChatML format instead.
+> If you just want to chat with it, you can use a trick like this:
+```
+<|im_end|>
+<|im_start|>system
+prefix<|im_end|>
+<|im_start|>user
+do
+    --打印：你好世界
+    local tex<|im_end|>
+<|im_start|>system
+suffix<|im_end|>
+<|im_start|>user
+nd<|im_end|>
+<|im_start|>system
+middle
+```
+It doesn't work very well, but it is a quick way to try the model. The chat template converts it into this prompt text:
+```
+<|im_start|>user
+<|im_end|>
+<|im_start|>system
+prefix<|im_end|>
+<|im_start|>user
+do
+    --打印：你好世界
+    local tex<|im_end|>
+<|im_start|>system
+suffix<|im_end|>
+<|im_start|>user
+nd<|im_end|>
+<|im_start|>system
+middle<|im_end|>
+```
+Hopefully the model skips the leading `<|im_start|>user\n<|im_end|>` part.
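The conversion described above can be reproduced with plain string formatting. This is a sketch under assumptions: `build_chat_trick` and `wrap_as_user_turn` are hypothetical helpers, and `wrap_as_user_turn` only approximates how a ChatML template wraps a single user message. A real template would normally also append an assistant generation prompt, which is left out here just as it is in the prompt text shown above.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chat_trick(prefix_code, suffix_code):
    """Text to paste into a chat box as one user message."""
    return (
        f"{IM_END}\n"
        f"{IM_START}system\nprefix{IM_END}\n"
        f"{IM_START}user\n{prefix_code}{IM_END}\n"
        f"{IM_START}system\nsuffix{IM_END}\n"
        f"{IM_START}user\n{suffix_code}{IM_END}\n"
        f"{IM_START}system\nmiddle"
    )


def wrap_as_user_turn(message):
    """Approximate the wrapping a ChatML template applies to a user message."""
    return f"{IM_START}user\n{message}{IM_END}\n"


# Same Lua fragment as in the card; the Lua comment means "print: hello world".
trick = build_chat_trick("do\n    --打印：你好世界\n    local tex", "nd")
prompt = wrap_as_user_turn(trick)
print(prompt)
```

The leading `<|im_end|>` in the pasted text is what closes the empty user turn, so everything after it is read as the same prefix/suffix/middle sequence used in the API request.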
+## Train Device
+> Online GPUs are expensive!
+| Category       | Configuration                                     |
+|----------------|---------------------------------------------------|
+| **Image**      | Ubuntu 22.04                                      |
+| **PyTorch**    | 2.5.1                                             |
+| **Python**     | 3.12                                              |
+| **CUDA**       | 12.4                                              |
+| **GPU**        | RTX 4090 (24 GB) x 1                              |
+| **CPU**        | 25 vCPU Intel(R) Xeon(R) Platinum 8481C           |
+| **Memory**     | 90 GB                                             |
+| **Disk**       | 30 GB + 50 GB                                     |
+| **Duration**   | 3 days                                            |