Built with Axolotl

See axolotl config

axolotl version: 0.8.0.dev0

# Objectif: SFT QLoRA 4-bit rapide et stable sur A40 48Go

# -------- Modèle --------
base_model: microsoft/Phi-4-mini-instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
#trust_remote_code: true

# -------- Contexte / I/O --------
# 4096 est un bon sweet spot A40 (on peut monter à 8192 si besoin, cf. variantes).
sequence_len: 4096
sample_packing: true                  # pack d’exemples pour remplir les séquences
pad_to_sequence_len: true
# group_by_length: true               # optionnel: batches plus denses, parfois + perf

# -------- Données --------
datasets:
  - path: laurent-maille/pcl-test-S27 # JSONL en format chat ou instruct
    type: chat_template
    field_messages: messages
    # conversation: chat
    # Si ton dataset est en simples "prompt"/"response", remplace par:
    # type: completion
    # field_input: prompt
    # field_output: response
dataset_prepared_path: ./prepared/plc_sharegpt
val_set_size: 0.02                    # ~2% pour validation

# Normalisation (optionnelle — utile si sources hétérogènes)
# preprocessed: false
# shuffle: true
# dataset_processes: 4

# Ne pas pénaliser le prompt utilisateur (SFT standard)
train_on_inputs: false

# -------- QLoRA / BitsAndBytes --------
adapter: lora
load_in_4bit: true
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: bfloat16

# -------- Cibles LoRA (Phi-4 mini) --------
# Essaye d’abord cette liste; si Axolotl signale qu’un module n’existe pas,
# utilise le fallback juste dessous.
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
#lora_target_modules: [qkv_proj, o_proj, gate_up_proj, down_proj]
# Fallback universel:
lora_target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]

# -------- Optim / Scheduler --------
optimizer: adamw_torch
learning_rate: 1.5e-4
lr_scheduler: cosine
warmup_ratio: 0.05
weight_decay: 0.05
max_grad_norm: 1.0

# -------- Entraînement --------
epochs: 2                              # 2–3 pour 30–80M tokens
micro_batch_size: 4                    # par GPU
gradient_accumulation_steps: 8         # => ~131k tokens/step (4×4096×8) si packing plein
gradient_checkpointing: true
bf16: true
flash_attention_2: true                   # fortement recommandé sur A40
torch_compile: true                  # active seulement si ta stack PyTorch est clean

# -------- Évaluation / Sauvegardes --------
logging_steps: 10
eval_strategy: steps
eval_steps: 200
save_steps: 400
save_total_limit: 3
output_dir: ./outputs/phi4mini_qlora_plc

# -------- Déploiement LoRA --------
lora_fuse: true                       # true pour fusionner en un seul .bin en fin

# -------- Journalisation (optionnelle) --------
wandb_project: phi4mini_qlora
wandb_run_name: a40_run_01
wandb_watch: gradients

# -------- Deepspeed (optionnel, 1xGPU) --------
# deepspeed: configs/ds_zero2_a40.json
# Remarque: sur 1 GPU, Deepspeed n’apporte pas toujours un gain majeur,
# mais Zero-2 peut stabiliser la mémoire si tu montes seq_len/batch.

# -------- Gestions diverses --------
# gradient_accumulation_bytes: null
save_safetensors: true
# strict: false

outputs/phi4mini_qlora_plc

This model is a fine-tuned version of microsoft/Phi-4-mini-instruct on the laurent-maille/pcl-test-S27 dataset.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.00015
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 1.0

Training results

Training Loss Epoch Step Validation Loss
No log 0.0181 1 7.1369

Framework versions

  • PEFT 0.14.0
  • Transformers 4.55.4
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Downloads last month
16
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for laurent-maille/Valuoty-industry-plc-4B-V0.5Adap

Adapter
(86)
this model

Dataset used to train laurent-maille/Valuoty-industry-plc-4B-V0.5Adap