This model is a fine-tuned version of Qwen/Qwen2-0.5B-Instruct. It has been trained using TRL with ORPO (Odds Ratio Preference Optimization).

Model Details

  • Base Model: Qwen/Qwen2-0.5B-Instruct
  • Training Method: ORPO (Odds Ratio Preference Optimization)
  • Training Dataset: trl-lib/ultrafeedback_binarized
  • Training Time: 1 hour 32 minutes
  • Hardware: Single GPU

Training Metrics

  • Training Loss: 6.386
  • Train Samples per Second: 11.152
  • Train Steps per Second: 0.697
  • Final Epoch: 1.0

Quick Start (running on CPU)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import logging

logging.basicConfig(level=logging.INFO)

def load_model():
    """Load the model and tokenizer."""
    logger.info("Loading model...")
    model = AutoModelForCausalLM.from_pretrained("iben/Abuja-01", trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained("iben/Abuja-01", trust_remote_code=True)
    return model, tokenizer

View the full code on GitHub Gist

Training Details

This model was trained using ORPO (Odds Ratio Preference Optimization), a method that doesn't require a reference model. The training configuration included:

  • Learning Rate: 1e-5
  • Batch Size: 4
  • Gradient Accumulation Steps: 4
  • Training Epochs: 1

Framework Versions

  • TRL: 0.13.0
  • Transformers: 4.48.1
  • PyTorch: 2.5.1+cu121
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
Downloads last month
2
Safetensors
Model size
494M params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for iben/Abuja-01

Base model

Qwen/Qwen2-0.5B
Finetuned
(199)
this model
Quantizations
1 model

Dataset used to train iben/Abuja-01