
๋ชจ๋ธ ์นด๋“œ: Qwen2.5-0.5B-schoolmath_v2

์ž‘์„ฑ์ž: Seongeon Kim ๋‚ ์งœ: 2025-05-27 ํ—ˆ๋ธŒ ๋งํฌ: https://huggingface.co/SeongeonKim/Qwen2.5-0.5B-schoolmath_v2


1. ๋ชจ๋ธ ๊ฐœ์š”

  • ๋ฒ ์ด์Šค ๋ชจ๋ธ: unsloth/Qwen2-0.5B
  • ํŒŒ์ธํŠœ๋‹: PEFT LoRA ์–ด๋Œ‘ํ„ฐ + TRL SFTTrainer
  • ๋ชฉํ‘œ: GSM8K(Grade-School Math) ๋ฐ์ดํ„ฐ์…‹์˜ ์ดˆ์ค‘๋“ฑ ์ˆ˜์ค€ ์ˆ˜๋ฆฌ ๋ฌธ์ œ์— ๋Œ€ํ•œ ์ •ํ™•๋„ ํ–ฅ์ƒ

2. ํŒŒ์ธํŠœ๋‹ ์„ค์ •

  • LoRA ๊ตฌ์„ฑ

    • ๋žญํฌ r = 16
    • ฮฑ (alpha) = 16
    • dropout = 0
    • ํƒ€๊นƒ ๋ชจ๋“ˆ: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • ํ•™์Šต ํ”„๋ ˆ์ž„์›Œํฌ

    • Transformers 4.51.0.dev0
    • TRL SFTTrainer v0.4.8
    • Unsloth FastLanguageModel (4๋น„ํŠธ ์–‘์žํ™” ์ถ”๋ก )
  • ํ•™์Šต ๋ฐ์ดํ„ฐ

    • GSM8K train split (7,473๋ฌธ์ œ)

    • Prompt template:

      Please solve the following math problem according to these instructions:
      
      1. Whenever you perform a calculation, display it in the form <<expression=result>>.
      2. Alternate between a brief explanatory sentence and the corresponding calculation, step by step.
      3. On the very last line, output only the final numeric answer in the format #### {{answer}}.
      
      ### Problem: {}
      ### Solution: {}
      
  • ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ

    • ์—ํญ(epoch) = 3
    • ๋ฐฐ์น˜ ์‚ฌ์ด์ฆˆ = 2 (per_device_train_batch_size)
    • gradient_accumulation_steps = 8
    • ํ•™์Šต๋ฅ  = 5e-5
    • warmup_ratio = 0.05
    • optimizer = adamw_8bit
    • scheduler = linear
  • ์ปดํ“จํŠธ ํ™˜๊ฒฝ

    • ํ”„๋กœํ† ํƒ€์ž…: 1ร—Tesla T4 (Colab)
    • ํ‰๊ฐ€: 4ร—RTX3090 (vLLM ๋ฐฑ์—”๋“œ)
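The prompt template above can be filled programmatically. Below is a minimal sketch; the helper name `format_example` and the convention of leaving the solution slot empty at inference time are illustrative assumptions, not taken from the actual training script:

```python
# Prompt template from this card; the two {} slots take the problem text
# and (at train time) the reference solution.
PROMPT_TEMPLATE = """Please solve the following math problem according to these instructions:

1. Whenever you perform a calculation, display it in the form <<expression=result>>.
2. Alternate between a brief explanatory sentence and the corresponding calculation, step by step.
3. On the very last line, output only the final numeric answer in the format #### {{answer}}.

### Problem: {}
### Solution: {}"""


def format_example(question, solution=""):
    # At train time the GSM8K reference solution fills the second slot;
    # at inference it is left empty so the model generates the solution.
    return PROMPT_TEMPLATE.format(question, solution)
```

GSM8K's `question` and `answer` fields map directly onto the two slots when building the SFT dataset.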

3. ๋ฒค์น˜๋งˆํฌ ๊ฒฐ๊ณผ ๋น„๊ต

๋ชจ๋ธ strict-match (5-shot) flexible-extract (5-shot)
ํŒŒ์ธํŠœ๋‹ ๋ชจ๋ธ
Qwen2.5-0.5B-schoolmath_v1
24.56 ยฑ 1.19 % 24.56 ยฑ 1.19 %
๋ฒ ์ด์ง ๋ชจ๋ธ
unsloth/Qwen2.5-0.5B
34.34 ยฑ 1.31 % 35.18 ยฑ 1.32 %

Notes

  • strict-match: an answer counts as correct only if the output contains the exact #### <answer> pattern
  • flexible-extract: the numeric answer is extracted from anywhere in the output (commas and dollar signs are stripped) before matching
  • Note that under both metrics the fine-tuned model currently scores below the base model on this benchmark.
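The two scoring rules can be sketched as simple regex extractors. This is an illustrative approximation of the behavior described above, not the exact lm-evaluation-harness implementation:

```python
import re


def strict_match(text):
    """Strict-match: succeed only if the output contains the exact
    '#### <answer>' pattern used in the training template."""
    m = re.search(r"#### (-?[0-9.,]+)", text)
    if not m:
        return None
    return m.group(1).replace(",", "").rstrip(".")


def flexible_extract(text):
    """Flexible-extract: take the last number-like token anywhere in
    the output, stripping commas and dollar signs."""
    nums = re.findall(r"-?\$?\d[\d,]*(?:\.\d+)?", text)
    if not nums:
        return None
    return nums[-1].replace("$", "").replace(",", "")
```

An output like `"... so she has 18 eggs left. #### 18"` passes both rules, while `"The total cost is $1,250."` fails strict-match but yields `1250` under flexible-extract.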

4. Usage Examples

(1) Evaluation with the vLLM CLI

pip install "lm-eval[vllm]" transformers

lm_eval \
  --model vllm \
  --model_args pretrained=SeongeonKim/Qwen2.5-0.5B-schoolmath_v1 \
  --tasks gsm8k \
  --num_fewshot 5 \
  --device cuda:0 \
  --batch_size auto

(2) Loading with Transformers + PEFT

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the tokenizer, the base model, and the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("SeongeonKim/Qwen2.5-0.5B-schoolmath_v1")
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2-0.5B", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base_model, "SeongeonKim/Qwen2.5-0.5B-schoolmath_v1")

model.eval().to("cuda")
prompt = "Question: If you have 3 apples and buy 2 more, how many apples? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

5. Intended Use & Limitations

  • Intended use: research and education, e.g. experiments on multi-step math word problems such as GSM8K

  • Limitations

    • 5-shot strict-match ≈ 24.6 %, far below human performance (≈ 90 %+)
    • risk of wrong answers and hallucinated calculations
    • not recommended for production uses such as automatically grading student work

6. Ethics & License

  • License: Apache 2.0

  • Ethical considerations

    • avoid use as an automatic grading tool
    • do not use for decision-making without verifying the model's outputs

7. Citation

@misc{SeongeonKim2025QwenSchoolMath,
  title        = {Qwen2.5-0.5B-schoolmath\_v1: LoRA-tuned Qwen2.5-0.5B on GSM8K},
  author       = {Seongeon Kim},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/SeongeonKim/Qwen2.5-0.5B-schoolmath_v1}},
}

Model size: 494M parameters (Safetensors, BF16)