nhn_dpo_v3_nox-solar-10.7b-v4_DPO

Our Team

  • Youjin Chung
  • Jingyeom Kim

Model

Base Model

Hardware and Software

  • Hardware: 8 × A100 GPUs for training
  • Software: DeepSpeed library and Hugging Face TRL Trainer
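
For readers unfamiliar with the objective that a DPO trainer optimizes, the following is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the team's training code: the function name `dpo_loss`, its argument names, and the value `beta=0.1` are illustrative assumptions, and this card does not state the hyperparameters actually used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (illustrative sketch, hypothetical helper).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an assumed value, not one reported in this card.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # relative to the reference model's preference.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# When policy and reference agree, the loss equals log(2).
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
# When the policy favors the chosen response more than the reference does,
# the loss drops below log(2).
improved = dpo_loss(-1.0, -2.0, -1.5, -1.5)
```

The loss decreases as the policy widens the gap between chosen and rejected responses beyond what the reference model already assigns.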

Dataset

  • DPO_dataset
    • Self-built DPO dataset (using AI-hub datasets)
    • Translations of English datasets such as OpenOrca DPO (translated with our own model, ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024)
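
The card does not show the dataset schema. As a hedged illustration, preference data for DPO is commonly stored as records with `prompt`, `chosen`, and `rejected` fields, which is the layout the TRL DPO trainer accepts. The example record and the helper `is_valid_preference_record` below are hypothetical and are not taken from the actual dataset.

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def is_valid_preference_record(record):
    """Check that a record has non-empty string fields for all required keys
    and that the two responses differ (hypothetical sanity check)."""
    has_fields = all(
        isinstance(record.get(key), str) and record[key].strip() != ""
        for key in REQUIRED_KEYS
    )
    return has_fields and record["chosen"] != record["rejected"]


# Illustrative record only; not drawn from the real training data.
example = {
    "prompt": "What is the capital of South Korea?",
    "chosen": "The capital of South Korea is Seoul.",
    "rejected": "The capital of South Korea is Busan.",
}

print(is_valid_preference_record(example))
```

A check like this can be run over every row before training to filter out records with missing or identical responses.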

Training Method

Benchmark

Ko LM Eval Harness

0-shot (macro F1)

kobest_boolq  kobest_copa  kobest_hellaswag  kobest_sentineg
0.931613      0.740751     0.468602          0.488465
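
The scores above are macro-averaged F1. As a reminder of what that metric measures, here is a small self-contained sketch that computes F1 per class and averages the results with equal weight. The function `macro_f1` and the toy labels are illustrative assumptions; the evaluation harness's own implementation is not shown in this card and may differ in detail.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy binary example (not benchmark data).
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
toy_score = macro_f1(toy_true, toy_pred)
```

Because every class contributes equally, macro F1 penalizes a model that ignores a minority class more than plain accuracy would.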

Safetensors

  • Model size: 10.7B params
  • Tensor type: BF16