XL-LuaCopilot-1.7B-FFT
XL-LuaCopilot-1.7B-FFT is a large language model (LLM) based on the Qwen architecture (Qwen3-1.7B), designed specifically for code generation in the Lua programming language. It has been fully fine-tuned (FFT) to improve its performance and efficiency when generating Lua code.
I suggest you use "chat_template_kwargs": {"enable_thinking": false}, because my training data contains no thinking traces. I have also found that a low temperature usually works well for code generation tasks.
quantize=["Q4_0", "Q4_1", "Q5_0", "Q5_1", "IQ3_XXS", "IQ3_S", "IQ3_M", "Q3_K", "IQ3_XS", "Q3_K_S", "Q3_K_M", "Q3_K_L", "IQ4_NL", "IQ4_XS", "Q4_K", "Q4_K_S", "Q4_K_M", "Q5_K", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "F16", "BF16"]
checkpoint-37000 is the checkpoint where the model had just entered the plateau phase with a lower loss, and it might be better than checkpoint-46000. However, the GGUF files I provide are still released from the final checkpoint at the end of training.
Train Samples
1472000 = (steps=46000) x (per_device_train_batch_size=8) x (gradient_accumulation_steps=4) x (devices=1)
datasets: 1464339 = (luafiles=488113) x (split=3)
epochs = 1472000 / 1464339 ≈ 1.005
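As a quick sanity check on those numbers, here is a minimal sketch (the variable names simply mirror the training arguments above):

```python
# Reproduce the sample/epoch arithmetic from the "Train Samples" figures above.
steps = 46_000
per_device_train_batch_size = 8
gradient_accumulation_steps = 4
devices = 1

train_samples = steps * per_device_train_batch_size * gradient_accumulation_steps * devices
dataset_samples = 488_113 * 3  # each Lua file is split into 3 samples

print(train_samples)                               # 1472000
print(dataset_samples)                             # 1464339
print(round(train_samples / dataset_samples, 3))   # 1.005
```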
How To Use
With an OpenAI-compatible API (llama.cpp: llama-server)
-> REQUEST ->
{
"model": "XL-LuaCopilot-1.7B-FFT",
"messages": [
{"role": "system","content": "prefix"},
{"role": "user","content": "do\n--打印:你好世界\n local tex"},
{"role": "system","content": "suffix"},
{"role": "user","content": "nd"},
{"role": "system","content": "middle"}
],
"stream": false,
"cache_prompt": false,
"samplers": "edkypmxt",
"temperature": 0.2,
"dynatemp_range": 0.1,
"dynatemp_exponent": 1,
"top_k": 20,
"top_p": 0.9,
"min_p": 0.05,
"typical_p": 1,
"xtc_probability": 0,
"xtc_threshold": 0.1,
"repeat_last_n": 32,
"repeat_penalty": 1.1,
"presence_penalty": 0,
"frequency_penalty": 0.5,
"dry_multiplier": 0,
"dry_base": 1.75,
"dry_allowed_length": 2,
"dry_penalty_last_n": -1,
"max_tokens": -1,
"timings_per_token": true,
"chat_template_kwargs": {"enable_thinking": false}
}
-> RESPONSE ->
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "<think>\n\n</think>\n\nt = \"你好世界\"\n print(text)\ne"
}
}
],
...
}
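For reference, the same FIM-style request can be sent from a short Python script. This is a minimal sketch, not the only way to call the model: the server URL/port and the use of the requests library are assumptions, and the sampling fields simply mirror the JSON above (llama-server serves the OpenAI-compatible /v1/chat/completions endpoint).

```python
# Minimal sketch: send the FIM-style chat request above to a local
# llama-server instance and splice the returned "middle" into the code.
# The URL/port are assumptions; adjust them to your setup.
import requests

prefix = "do\n--打印:你好世界\n local tex"  # code before the cursor
suffix = "nd"                              # code after the cursor

payload = {
    "model": "XL-LuaCopilot-1.7B-FFT",
    "messages": [
        {"role": "system", "content": "prefix"},
        {"role": "user", "content": prefix},
        {"role": "system", "content": "suffix"},
        {"role": "user", "content": suffix},
        {"role": "system", "content": "middle"},
    ],
    "stream": False,
    "temperature": 0.2,
    "top_k": 20,
    "top_p": 0.9,
    "min_p": 0.05,
    "max_tokens": -1,
    "chat_template_kwargs": {"enable_thinking": False},
}

resp = requests.post("http://127.0.0.1:8080/v1/chat/completions", json=payload, timeout=120)
middle = resp.json()["choices"][0]["message"]["content"]

# Drop the empty <think></think> block that the non-thinking template emits,
# then reassemble prefix + middle + suffix into the completed Lua snippet.
middle = middle.split("</think>")[-1].lstrip("\n")
print(prefix + middle + suffix)
```

With the response shown above, this prints the completed block: local text = "你好世界" and print(text) wrapped inside do ... end.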
I know Qwen has <|fim_prefix|> / <|fim_suffix|> / <|fim_middle|> tokens, but I'm not sure whether Qwen3 was trained on them (I only know Qwen2.5-Coder was). To keep code generation simple, I use the ChatML format instead.
If you just want to chat with it, you can use a trick like this:
<|im_end|>
<|im_start|>system
prefix<|im_end|>
<|im_start|>user
do
--打印:你好世界
local tex<|im_end|>
<|im_start|>system
suffix<|im_end|>
<|im_start|>user
nd<|im_end|>
<|im_start|>system
middle
It doesn't work very well, but it's a quick way to try the model. The chat template converts it into this prompt text:
<|im_start|>user
<|im_end|>
<|im_start|>system
prefix<|im_end|>
<|im_start|>user
do
--打印:你好世界
local tex<|im_end|>
<|im_start|>system
suffix<|im_end|>
<|im_start|>user
nd<|im_end|>
<|im_start|>system
middle<|im_end|>
Hopefully the model skips the leading <|im_start|>user\n<|im_end|> part.
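If you would rather avoid that trick, a hedged alternative is to post the already-expanded prompt text to llama-server's raw /completion endpoint, bypassing the chat template. The endpoint name, port, and the trailing <|im_start|>assistant line are assumptions; check your llama-server version.

```python
# Sketch: bypass the chat template and send the expanded ChatML prompt
# directly to llama-server's raw /completion endpoint.
# Endpoint name, port, and the trailing assistant header are assumptions.
import requests

prompt = (
    "<|im_start|>system\nprefix<|im_end|>\n"
    "<|im_start|>user\ndo\n--打印:你好世界\n local tex<|im_end|>\n"
    "<|im_start|>system\nsuffix<|im_end|>\n"
    "<|im_start|>user\nnd<|im_end|>\n"
    "<|im_start|>system\nmiddle<|im_end|>\n"
    "<|im_start|>assistant\n"
)

resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={"prompt": prompt, "n_predict": 128, "temperature": 0.2},
    timeout=120,
)
print(resp.json()["content"])
```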
Train Device
Online GPUs are expensive!
| Category | Configuration |
|---|---|
| Image | Ubuntu 22.04 |
| PyTorch | 2.5.1 |
| Python | 3.12 |
| CUDA | 12.4 |
| GPU | RTX 4090 (24GB) * 1 |
| CPU | 25 vCPU Intel(R) Xeon(R) Platinum 8481C |
| Memory | 90 GB |
| Disk | 30 GB + 50 GB |
| Duration | 3 days |