# SmolLM2-360M-Instruct-TaiwanChat
This model is a fine-tuned version of unsloth/SmolLM2-360M-Instruct on the TaiwanChat dataset using Unsloth’s 4-bit quantization and LoRA adapters for efficient instruction-following in Traditional Chinese.
## Installation

```bash
pip install -r requirements.txt
```
## Requirements
- Python: 3.8 or higher
- CUDA: 11.0 or higher (for GPU support)
- All other dependencies and exact versions are specified in requirements.txt.
## Model description
- Base: SmolLM2-360M-Instruct (360M parameters)
- Quantization: 4-bit weight quantization (activations in full precision)
- Adapters: LoRA with rank `r=16`, alpha `α=16`, dropout `0.0`, applied to the projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`); see the sketch below
- Dataset: TaiwanChat (`yentinglin/TaiwanChat`): 600k filtered examples, max length 512, streamed and deduplicated, then split 90% train / 10% validation
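The configuration above corresponds roughly to the following Unsloth setup. This is a minimal sketch for illustration: the rank, alpha, dropout, target modules, and sequence length come from this card, while all other arguments are assumptions.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model (weights in 4-bit, activations in full precision).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/SmolLM2-360M-Instruct",
    max_seq_length=512,
    load_in_4bit=True,
    full_finetuning=False,
)

# Attach LoRA adapters to the projection layers listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    random_state=3407,  # seed taken from the hyperparameter table below
)
```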
## Intended uses & limitations
Intended uses:
- Conversational AI and chatbots handling Traditional Chinese queries (e.g., weather, FAQs).
- Instruction-following in a dialogue format.
Limitations:
- Limited capacity may cause occasional hallucinations or vague answers.
- Performance measured on a 10% hold-out; real-world data discrepancies may impact quality.
- Quantization and adapter-based tuning trade off some accuracy for efficiency.
## Training procedure

### Data preparation
- Streamed 600k examples from the Hugging Face dataset, filtered to `max_len=512`, cleaned assistant markers via regex, then shuffled and split with `Dataset.train_test_split(test_size=0.1)`, as sketched below
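A minimal sketch of this data-preparation step. The dataset ID, length cap, example count, and split ratio come from this card; the column name (`messages`), the chat-template formatting, and the cleanup regex are assumptions.

```python
import re
from datasets import Dataset, load_dataset
from transformers import AutoTokenizer

MAX_LEN = 512            # length cap from this card
NUM_EXAMPLES = 600_000   # number of streamed examples kept

tokenizer = AutoTokenizer.from_pretrained("unsloth/SmolLM2-360M-Instruct")
stream = load_dataset("yentinglin/TaiwanChat", split="train", streaming=True)

records, seen = [], set()
for example in stream:
    # Render the chat turns as a single training string (assumed formatting).
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    # Hypothetical regex cleanup of stray assistant markers.
    text = re.sub(r"<\|assistant\|>\s*", "", text)
    # Deduplicate and enforce the 512-token cap.
    if text in seen or len(tokenizer(text)["input_ids"]) > MAX_LEN:
        continue
    seen.add(text)
    records.append({"text": text})
    if len(records) >= NUM_EXAMPLES:
        break

dataset = Dataset.from_list(records).shuffle(seed=3407)
splits = dataset.train_test_split(test_size=0.1)  # 90% train / 10% validation
train_ds, eval_ds = splits["train"], splits["test"]
```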
### Model & training setup
- Loaded the base model with `FastLanguageModel.from_pretrained(..., load_in_4bit=True, full_finetuning=False)`
- Applied LoRA adapters via `FastLanguageModel.get_peft_model(...)`
- Used a `LoggingSFTTrainer` subclass to catch empty-label and NaN-loss cases during evaluation (see the sketch below)
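The `LoggingSFTTrainer` subclass itself is not published; the sketch below is a hypothetical reconstruction of the behaviour described above, flagging batches whose labels are fully masked and batches that produce a NaN loss.

```python
import math
from trl import SFTTrainer

class LoggingSFTTrainer(SFTTrainer):
    """Hypothetical reconstruction: log empty-label and NaN-loss batches."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.get("labels")
        if labels is not None and (labels != -100).sum() == 0:
            print("Warning: batch has no supervised tokens (all labels masked).")
        result = super().compute_loss(model, inputs, return_outputs=return_outputs, **kwargs)
        loss = result[0] if return_outputs else result
        if loss is not None and math.isnan(float(loss)):
            print("Warning: NaN loss encountered.")
        return result
```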
### Hyperparameters
| Parameter | Value |
| --- | --- |
| num_train_epochs | 3 |
| per_device_train_batch_size | 40 |
| gradient_accumulation_steps | 1 |
| per_device_eval_batch_size | 1 |
| learning_rate | 2e-4 |
| weight_decay | 0.01 |
| warmup_steps | 500 |
| max_seq_length | 512 |
| evaluation_strategy | steps (every 100) |
| eval_steps | 100 |
| save_strategy | steps (every 1000) |
| logging_steps | 50 |
| optimizer | adamw_8bit |
| gradient_checkpointing | false |
| seed | 3407 |
| EarlyStoppingCallback patience | 4 evals |
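As a sketch, these settings map onto trl's `SFTConfig` roughly as follows; `output_dir` and the best-model settings are assumptions not stated in the table.

```python
from trl import SFTConfig
from transformers import EarlyStoppingCallback

training_args = SFTConfig(
    output_dir="outputs",             # assumption
    num_train_epochs=3,
    per_device_train_batch_size=40,
    gradient_accumulation_steps=1,
    per_device_eval_batch_size=1,
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_steps=500,
    max_seq_length=512,
    eval_strategy="steps",            # "evaluation_strategy" in older transformers releases
    eval_steps=100,
    save_strategy="steps",
    save_steps=1000,
    logging_steps=50,
    optim="adamw_8bit",
    gradient_checkpointing=False,
    seed=3407,
    load_best_model_at_end=True,      # needed for EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)

early_stopping = EarlyStoppingCallback(early_stopping_patience=4)
```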
### Training & push

- Ran `trainer.train()`, merged the LoRA weights, then pushed the merged 16-bit model to `Luigi/SmolLM2-360M-Instruct-TaiwanChat` on Hugging Face via `model.push_to_hub_merged()`, as sketched below
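A sketch of this final step, combining the pieces from the earlier sketches (`training_args`, `early_stopping`, `train_ds`/`eval_ds`, and `LoggingSFTTrainer` are assumptions carried over from above):

```python
trainer = LoggingSFTTrainer(
    model=model,
    processing_class=tokenizer,   # `tokenizer=` in older trl releases
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[early_stopping],
)
trainer.train()

# Unsloth merges the LoRA weights into the base model and pushes a 16-bit checkpoint.
model.push_to_hub_merged(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
    tokenizer,
    save_method="merged_16bit",
)
```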
## Example inference
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model (LoRA weights are already merged, so no PEFT wrapper is needed)
tokenizer = AutoTokenizer.from_pretrained("Luigi/SmolLM2-360M-Instruct-TaiwanChat")
model = AutoModelForCausalLM.from_pretrained(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
    torch_dtype=torch.float16,
).eval().to("cuda")

# Query
test_prompt = "請問台北今天的天氣如何?"
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Framework versions

```text
bitsandbytes==0.45.5
datasets==3.2.0
hatchet==1.4.0
importlib_metadata==8.6.1
lit==18.1.8
matplotlib
numpy
packaging
pandas
psutil==6.1.1
pybind11==2.13.6
pytest==8.1.1
redis==6.0.0
scipy
setuptools==70.3.0
Sphinx
sphinx_gallery
sphinx_rtd_theme
tabulate==0.9.0
torch==2.7.0
transformers==4.47.1
trl==0.15.2
unsloth==2025.4.1
unsloth_zoo==2025.4.2
cut_cross_entropy
wandb
wheel==0.45.1
```