See axolotl config
axolotl version: 0.8.0.dev0
# Goal: fast, stable 4-bit QLoRA SFT on an A40 48 GB
# -------- Model --------
base_model: microsoft/Phi-4-mini-instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
#trust_remote_code: true
# -------- Context / I/O --------
# 4096 is a good sweet spot on the A40 (can go up to 8192 if needed, see variants).
sequence_len: 4096
sample_packing: true # pack examples to fill the sequences
pad_to_sequence_len: true
# group_by_length: true # optional: denser batches, sometimes better throughput
# -------- Data --------
datasets:
  - path: laurent-maille/pcl-test-S27 # JSONL in chat or instruct format
    type: chat_template
    field_messages: messages
    # conversation: chat
    # If your dataset uses plain "prompt"/"response" fields, replace with:
    # type: completion
    # field_input: prompt
    # field_output: response
dataset_prepared_path: ./prepared/plc_sharegpt
val_set_size: 0.02 # ~2% held out for validation
# Normalization (optional; useful if sources are heterogeneous)
# preprocessed: false
# shuffle: true
# dataset_processes: 4
# Do not compute loss on the user prompt (standard SFT)
train_on_inputs: false
# -------- QLoRA / BitsAndBytes --------
adapter: lora
load_in_4bit: true
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: bfloat16
# -------- LoRA targets (Phi-4 mini) --------
# Try this list first; if Axolotl reports that a module does not exist,
# use the fallback just below.
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
#lora_target_modules: [qkv_proj, o_proj, gate_up_proj, down_proj]
# Universal fallback:
lora_target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
# -------- Optimizer / Scheduler --------
optimizer: adamw_torch
learning_rate: 1.5e-4
lr_scheduler: cosine
warmup_ratio: 0.05
weight_decay: 0.05
max_grad_norm: 1.0
# -------- Training --------
epochs: 2 # 2-3 for 30-80M tokens
micro_batch_size: 4 # per GPU
gradient_accumulation_steps: 8 # => ~131k tokens/step (4×4096×8) if packing is full
gradient_checkpointing: true
bf16: true
flash_attention_2: true # strongly recommended on A40
torch_compile: true # enable only if your PyTorch stack is clean
# -------- Evaluation / Checkpoints --------
logging_steps: 10
eval_strategy: steps
eval_steps: 200
save_steps: 400
save_total_limit: 3
output_dir: ./outputs/phi4mini_qlora_plc
# -------- LoRA deployment --------
lora_fuse: true # true to merge into a single .bin at the end
# -------- Logging (optional) --------
wandb_project: phi4mini_qlora
wandb_run_name: a40_run_01
wandb_watch: gradients
# -------- DeepSpeed (optional, 1xGPU) --------
# deepspeed: configs/ds_zero2_a40.json
# Note: on 1 GPU, DeepSpeed does not always bring a major gain,
# but ZeRO-2 can stabilize memory if you increase seq_len/batch.
# -------- Miscellaneous --------
# gradient_accumulation_bytes: null
save_safetensors: true
# strict: false
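The LoRA target comments in the config above suggest trying one list of module names and falling back to another if a module is missing. A minimal sketch of that check follows. It uses a simplified leaf-name match, and the `example_modules` list is hypothetical: it assumes a layer layout with fused `qkv_proj` and `gate_up_proj` projections, which should be verified against the actual model before drawing conclusions.

```python
def match_targets(module_names, targets):
    """Map each target to the module names whose last dotted component equals it."""
    hits = {t: [] for t in targets}
    for name in module_names:
        leaf = name.rsplit(".", 1)[-1]
        if leaf in hits:
            hits[leaf].append(name)
    return hits


def unmatched(module_names, targets):
    """Return the targets that match no module, in the order they were given."""
    return [t for t, found in match_targets(module_names, targets).items() if not found]


# Hypothetical module names for one decoder layer (assumption: fused projections).
example_modules = [
    "model.layers.0.self_attn.qkv_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_up_proj",
    "model.layers.0.mlp.down_proj",
]

# The two candidate lists from the config.
fused = ["qkv_proj", "o_proj", "gate_up_proj", "down_proj"]
split = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]

print("unmatched (fused list):", unmatched(example_modules, fused))
print("unmatched (split list):", unmatched(example_modules, split))
```

With a loaded PyTorch model, the real names can be collected with `[name for name, _ in model.named_modules()]` and passed in place of `example_modules`. Under the assumed layout, the split list would leave several targets unmatched, so it is worth running this kind of check before training.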
outputs/phi4mini_qlora_plc
This model is a fine-tuned version of microsoft/Phi-4-mini-instruct on the laurent-maille/pcl-test-S27 dataset.
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.00015
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: OptimizerNames.ADAMW_TORCH with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 2
- num_epochs: 1.0
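The batch figures above can be cross-checked with a few lines of arithmetic. This sketch assumes a single GPU (the config targets one A40) and uses `sequence_len: 4096` from the config; the token count is an upper bound that only holds if sample packing fills every sequence.

```python
def effective_batch(micro_batch_size, gradient_accumulation_steps, num_gpus=1):
    """Number of sequences contributing to one optimizer step."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus


def max_tokens_per_step(total_batch_size, sequence_len):
    """Upper bound on tokens per optimizer step, assuming fully packed sequences."""
    return total_batch_size * sequence_len


total_train_batch_size = effective_batch(micro_batch_size=4, gradient_accumulation_steps=8)
tokens = max_tokens_per_step(total_train_batch_size, sequence_len=4096)

print(total_train_batch_size)  # 32, matching total_train_batch_size above
print(tokens)                  # 131072, the "~131k tokens/step" noted in the config
```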
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 0.0181 | 1 | 7.1369 |
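The single logged row still allows a rough estimate of the run length. The sketch below infers the number of optimizer steps per epoch from the epoch fraction reported at step 1. It is an approximation: the epoch value is rounded to four decimals, and the token figure assumes fully packed 4096-token sequences with a total batch size of 32.

```python
def estimate_steps_per_epoch(step, epoch_fraction):
    """Infer optimizer steps per epoch from one (step, epoch) log entry."""
    return round(step / epoch_fraction)


steps_per_epoch = estimate_steps_per_epoch(step=1, epoch_fraction=0.0181)

# Upper bound on tokens seen per epoch (assumption: every sequence fully packed).
tokens_per_step = 32 * 4096
max_tokens_per_epoch = steps_per_epoch * tokens_per_step

print(steps_per_epoch)       # 55
print(max_tokens_per_epoch)  # 7208960
```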
Framework versions
- PEFT 0.14.0
- Transformers 4.55.4
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
Model tree for laurent-maille/Valuoty-industry-plc-4B-V0.5Adap
Base model
microsoft/Phi-4-mini-instruct