---
language: ko
license: apache-2.0
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
  - financial
  - credit-rating
  - korean
  - llama
  - unsloth
  - fine-tuned
model_name: FinCreditLlama-3.2-3B
pipeline_tag: text-generation
---

FinCreditLlama-3.2-3B

๋ชจ๋ธ ๊ฐœ์š”

FinCreditLlama-3.2-3B is a Korean language model designed specifically for financial credit evaluation.

  • Base model: unsloth/Llama-3.2-3B-Instruct
  • Dataset: himedia/financial_dummy_data_v4
  • Training method: LoRA (Low-Rank Adaptation), merged into a full standalone model
  • Training timestamp: 20250702_181705

📊 Training Results

  • Final Training Loss: 0.8515
  • Final Validation Loss: 0.7593
  • Best Validation Loss: 0.7593 (step 10)
  • Overall Improvement: 62.7%
  • Training Time: 0.64 minutes

Hyperparameters

  • Learning Rate: 0.0002
  • Max Steps: 10
  • Batch Size: 2
  • Gradient Accumulation: 8
  • LoRA r: 64
  • LoRA alpha: 64
  • Max Sequence Length: 2048
  • Warmup Steps: 5
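
These hyperparameters correspond to a fairly standard Unsloth + TRL LoRA run. The sketch below is a minimal reconstruction, not the exact training script: the target modules, the dataset split, and the text field name are assumptions, and the argument names follow the older trl SFTTrainer interface commonly used in Unsloth examples.

from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank / alpha listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=64,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)

# Fine-tuning dataset (split name and "text" field are assumptions)
dataset = load_dataset("himedia/financial_dummy_data_v4", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",   # assumed field name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        warmup_steps=5,
        max_steps=10,
        learning_rate=2e-4,
        logging_steps=1,
        output_dir="outputs",
    ),
)
trainer.train()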

🔧 Memory Usage

  • GPU: NVIDIA A100-SXM4-40GB
  • Peak Memory: 6.674 GB
  • Memory Usage: 16.9%

Usage

Basic Usage (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_id = "himedia/fincredit-lamma-3.2-3b-lr2e04-bs16-r64-steps10-20250702_181705"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Simple inference example
prompt = "고객의 신용등급을 평가해주세요:"  # "Please evaluate the customer's credit rating:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
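
Because the base checkpoint is Llama-3.2-3B-Instruct, wrapping the request in the chat template may yield cleaner responses. This is a minimal sketch reusing the tokenizer and model loaded above; the system-message wording is an illustrative assumption, not taken from the training data.

# Chat-template variant (message contents are illustrative assumptions)
messages = [
    {"role": "system", "content": "You are a Korean financial credit-rating assistant."},
    {"role": "user", "content": "고객의 신용등급을 평가해주세요:"},  # "Please evaluate the customer's credit rating:"
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))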

vLLM Usage (High-Performance Inference)

from vllm import LLM, SamplingParams

# Load with vLLM (the model is already merged, so it can be used directly)
llm = LLM(
    model="himedia/fincredit-lamma-3.2-3b-lr2e04-bs16-r64-steps10-20250702_181705",
    max_model_len=2048,
    gpu_memory_utilization=0.85
)

# Configure sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.9,
    max_tokens=200
)

# Run inference
prompts = ["고객의 신용등급을 평가해주세요:"]  # "Please evaluate the customer's credit rating:"
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}")
    print(f"Generated text: {generated_text!r}")

Usage in an Unsloth Test Environment

from unsloth import FastLanguageModel

# Test with the original LoRA adapter
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "himedia/fincredit-Llama-3.2-3B-lr2e04-bs16-r64-steps1000-20250623_060351",  # LoRA adapter
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
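
A quick generation check with the adapter loaded above can then be run in Unsloth's inference mode. This is a minimal sketch; the prompt is the same illustrative one used in the earlier examples.

# Enable Unsloth's faster inference mode, then generate
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    "고객의 신용등급을 평가해주세요:",  # "Please evaluate the customer's credit rating:"
    return_tensors="pt",
).to("cuda")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))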

📊 Training Log Files

This repository includes the following training-related files:

  • training_log.json: full training log (JSON format)
  • FinCreditLlama-3.2-3B_20250702_181705_training_curves.png: training-curve visualization
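
The log can be pulled from the Hub and inspected directly. This is a minimal sketch, assuming training_log.json holds a list of per-step records with step / loss / eval_loss fields; the exact schema is an assumption.

import json
from huggingface_hub import hf_hub_download

# Download the training log from this repository
log_path = hf_hub_download(
    repo_id="himedia/fincredit-lamma-3.2-3b-lr2e04-bs16-r64-steps10-20250702_181705",
    filename="training_log.json",
)

with open(log_path) as f:
    log = json.load(f)

# Print per-step losses (field names are assumed)
for record in log:
    print(record.get("step"), record.get("loss"), record.get("eval_loss"))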

Repository Name Breakdown

The repository name fincredit-lamma-3.2-3b-lr2e04-bs16-r64-steps10-20250702_181705 breaks down as follows:
  • fincredit-lamma-3.2-3b: model base name
  • lr2e04: learning rate (2e-4)
  • bs16: effective batch size (per-device batch size 2 × gradient accumulation 8)
  • r64: LoRA rank
  • steps10: training steps
  • 20250702_181705: training timestamp
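
Purely for illustration, a name following this convention can be split into its parts with a small regex; the pattern below is an assumption and not part of the training code.

import re

# Illustrative parser for the naming convention above (regex is an assumption)
NAME_RE = re.compile(
    r"^(?P<base>.+)-lr(?P<lr>[0-9e]+)-bs(?P<bs>\d+)-r(?P<rank>\d+)"
    r"-steps(?P<steps>\d+)-(?P<timestamp>\d{8}_\d{6})$"
)

name = "fincredit-lamma-3.2-3b-lr2e04-bs16-r64-steps10-20250702_181705"
print(NAME_RE.match(name).groupdict())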

Deployment Information

  • ๋ชจ๋ธ ํƒ€์ž…: ๋ณ‘ํ•ฉ๋œ ์ „์ฒด ๋ชจ๋ธ (LoRA ์–ด๋Œ‘ํ„ฐ๊ฐ€ ๋ฒ ์ด์Šค ๋ชจ๋ธ์— ๋ณ‘ํ•ฉ๋จ)
  • vLLM ํ˜ธํ™˜: โœ… ์™„์ „ ํ˜ธํ™˜
  • RunPod ๋ฐฐํฌ: โœ… ์ง€์›
  • ์›๋ณธ LoRA ์–ด๋Œ‘ํ„ฐ: himedia/fincredit-Llama-3.2-3B-lr2e04-bs16-r64-steps1000-20250623_060351

Performance

The model is fine-tuned on Korean financial text and specialized for credit-evaluation question answering.

License

Apache 2.0