Model Card for rungao2001/vicuna-7b-v1.5_deita10k_sft_full

Model Details

Model Description

This model was created by supervised fine-tuning (SFT) of lmsys/vicuna-7b-v1.5 on the HuggingFaceH4/deita-10k-v0-sft dataset.

  • Model type: Llama 2, decoder-only
  • Language(s) (NLP): English
  • License: llama2
  • Finetuned from model: lmsys/vicuna-7b-v1.5

Training Details

Training Data

HuggingFaceH4/deita-10k-v0-sft

Training Procedure

SFT

Note: do_sample in generation_config.json is set to True to avoid the error described in https://github.com/huggingface/transformers/issues/29988.
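The fix above amounts to a one-line patch to the generation config. A minimal sketch with the standard json module (the field values shown are illustrative, not copied from the released generation_config.json):

```python
import json

def enable_sampling(cfg: dict) -> dict:
    """Return a copy of a generation config with sampling enabled."""
    patched = dict(cfg)
    # do_sample=True keeps temperature/top_p consistent with sampling,
    # avoiding the validation error in transformers issue #29988.
    patched["do_sample"] = True
    return patched

# Illustrative config fields only; check the real file for actual values.
cfg = {"temperature": 0.9, "top_p": 0.6, "do_sample": False}
print(json.dumps(enable_sampling(cfg), indent=2))
```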

Training Hyperparameters

  • Precision: BFloat16
  • Chat Template: Vicuna 1.1
  • Global Batch Size: 128
  • Learning Rate: 2.0e-5
  • Num Epochs: 3
  • Max Length: 2048
  • Packing: True
  • Training Steps: 1047
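For reference, the Vicuna 1.1 chat template listed above can be sketched as follows. The system prompt and separators follow the FastChat "vicuna_v1.1" convention; verify against the released tokenizer/conversation config before relying on the exact strings:

```python
# Assumed Vicuna 1.1 system prompt (FastChat convention).
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def format_vicuna_v11(turns):
    """Format a conversation; turns is a list of (user, assistant-or-None) pairs."""
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f" USER: {user_msg} ASSISTANT:")
        if assistant_msg is not None:
            # Completed assistant turns end with the EOS token.
            parts.append(f" {assistant_msg}</s>")
    return "".join(parts)

prompt = format_vicuna_v11([("What is SFT?", None)])
```

A prompt for generation ends with "ASSISTANT:" so the model continues with the assistant's reply.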

Evaluation

The model reached a final loss of 0.8376 on the evaluation split of HuggingFaceH4/deita-10k-v0-sft.
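Since the loss is a per-token cross-entropy, it corresponds to a per-token perplexity of exp(loss), roughly 2.31:

```python
import math

eval_loss = 0.8376  # rounded final eval loss reported above
perplexity = math.exp(eval_loss)  # per-token perplexity = exp(cross-entropy)
print(f"{perplexity:.2f}")  # about 2.31
```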

