  • This model was trained to classify whether an input text is a "chosen" sentence or a "rejected" sentence.
  • The class probability (the softmax of the last-layer logits) can be used to quantify how strongly a user input is preferred.
  • Fine-tuned from Rakuten/RakutenAI-7B-instruct via LoRA on open-preference-v0.3.
  • Trained in bf16 precision.
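
As a rough illustration of the second point, the sketch below turns a pair of last-layer logits into a preference score with a softmax. It assumes a two-class output and that index 1 is the "chosen" class; the card does not state the label order, so `CHOSEN_INDEX` and the helper name `preference_score` are hypothetical and should be checked against the model's config before use.

```python
import math

# Assumption: the classifier emits two logits and index 1 means "chosen".
# The model card does not document the label order.
CHOSEN_INDEX = 1


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def preference_score(logits, chosen_index=CHOSEN_INDEX):
    """Probability assigned to the 'chosen' class, in [0, 1]."""
    return softmax(logits)[chosen_index]


if __name__ == "__main__":
    # Made-up logits, for illustration only.
    print(preference_score([-2.0, 2.0]))
```

A score near 1 would indicate text that the classifier treats as "chosen", and a score near 0 text that it treats as "rejected", under the label-order assumption above.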

Metric

  • validation
    accuracy  recall  precision  f1-score
    0.9694    0.9757  0.9636     0.9696
  • test
    accuracy  recall  precision  f1-score
    0.5162    0.8822  0.5093     0.6458
  • confusion matrix
    • x-axis shows ground truth
    • y-axis shows prediction

[Figure: confusion matrix]
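
The f1-score values in the tables above agree with the harmonic mean of the listed precision and recall. The short check below recomputes them; the helper name `f1_score` is chosen here for illustration, and only the numbers come from the card.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the metric tables.
reported = {
    "validation": (0.9636, 0.9757, 0.9696),
    "test": (0.5093, 0.8822, 0.6458),
}

if __name__ == "__main__":
    for split, (precision, recall, f1) in reported.items():
        recomputed = f1_score(precision, recall)
        print(f"{split}: recomputed={recomputed:.4f} reported={f1:.4f}")
```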


Collection including ryota39/RakutenAI-7B-instruct-reward