---
library_name: peft
license: apache-2.0
base_model: internlm/internlm3-8b-instruct
tags:
- axolotl
- generated_from_trainer
datasets:
- ToastyPigeon/some-rp
- BeaverAI/cedo-unalignment
- BeaverAI/foundRP
- PocketDoc/Dans-Prosemaxx-Gutenberg
- ToastyPigeon/SpringDragon-Instruct
- allenai/tulu-3-sft-personas-instruction-following
- allura-org/fujin-cleaned-stage-2
model-index:
- name: intern-rp-lora
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.6.0`

```yaml
# git clone https://github.com/axolotl-ai-cloud/axolotl
# cd axolotl
# git checkout bd2a594b8954103719f8d1ef739e2c3267ca36f6
# pip3 install packaging ninja huggingface_hub[cli]
# pip3 install -e '.[flash-attn,deepspeed]'
# huggingface-cli login --token $hf_key && wandb login $wandb_key
# python -m axolotl.cli.preprocess intern-rp-test-human.yml
# accelerate launch -m axolotl.cli.train intern-rp-test-human.yml
# python -m axolotl.cli.merge_lora qwen-rp-test-human.yml
# huggingface-cli upload ToastyPigeon/tqi-some-rp-40 train-workspace/merged . --exclude "*.md"
# sleep 10h; runpodctl stop pod $RUNPOD_POD_ID &

# git clone https://github.com/axolotl-ai-cloud/axolotl && cd axolotl && pip3 install packaging ninja huggingface_hub[cli] && pip3 install -e '.[flash-attn,deepspeed]' && cd .. && huggingface-cli login --token $hf_key && wandb login $wandb_key

# Model
base_model: internlm/internlm3-8b-instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true

load_in_8bit: false
load_in_4bit: true
strict: false
bf16: true
fp16:
tf32: false
flash_attention: true
special_tokens:

# Output
output_dir: ./train-workspace
hub_model_id: ToastyPigeon/intern-rp-lora
hub_strategy: "all_checkpoints"
auto_resume_from_checkpoint: true
#resume_from_checkpoint: ./train-workspace/checkpoint-304
saves_per_epoch: 2
save_total_limit: 4

# Data
sequence_len: 8192 # fits
min_sample_len: 128
chat_template: chatml
dataset_prepared_path: last_run_prepared
datasets:
  - path: ToastyPigeon/some-rp
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    #train_on_inputs: true
  - path: BeaverAI/cedo-unalignment
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
  - path: BeaverAI/foundRP
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    split: train[:1000]
  - path: PocketDoc/Dans-Prosemaxx-Gutenberg
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
  - path: ToastyPigeon/SpringDragon-Instruct
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    split: train[:500]
  - path: allenai/tulu-3-sft-personas-instruction-following
    type: chat_template
    field_messages: messages
    message_field_role: role
    message_field_content: content
    split: train[:500]
  - path: allura-org/fujin-cleaned-stage-2
    type: completion
    field: text
    split: train[:500]
warmup_steps: 20
shuffle_merged_datasets: true
sample_packing: true
pad_to_sequence_len: true

# Batching
num_epochs: 2
gradient_accumulation_steps: 1
micro_batch_size: 1
eval_batch_size: 1

# Evaluation
val_set_size: 100
evals_per_epoch: 10
eval_table_size:
eval_max_new_tokens: 256
eval_sample_packing: false

save_safetensors: true

# WandB
wandb_project: Intern-Rp-Test
#wandb_entity:

gradient_checkpointing: 'unsloth'
gradient_checkpointing_kwargs:
  use_reentrant: false
unsloth_cross_entropy_loss: true
#unsloth_lora_mlp: true
#unsloth_lora_qkv: true
#unsloth_lora_o: true

# LoRA
adapter: qlora
lora_r: 32
lora_alpha: 64
lora_dropout: 0.25
lora_target_linear: true
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
lora_modules_to_save:
#peft_use_rslora: true
#loraplus_lr_ratio: 8

# Optimizer
optimizer: paged_ademamix_8bit
lr_scheduler: cosine
learning_rate: 3e-5
cosine_min_lr_ratio: 0.1
weight_decay: 0.01
max_grad_norm: 1.0

# Misc
train_on_inputs: false
group_by_length: false
early_stopping_patience:
local_rank:
logging_steps: 1
xformers_attention:
#debug:
deepspeed: /workspace/axolotl/deepspeed_configs/zero3_bf16.json # previously blank
fsdp:
fsdp_config:

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

gc_steps: 10
seed: 69
```

</details>
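For readers who want the adapter settings outside of Axolotl, the sketch below is a rough PEFT/bitsandbytes equivalent of the QLoRA and quantization options in the config above. Axolotl builds these objects internally, so treat the object names and defaults here as assumptions rather than the exact code it runs.

```python
# Rough equivalent of the QLoRA settings in the config above (an approximation,
# not Axolotl's actual internals).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(      # load_in_4bit: true, bf16: true
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(             # adapter: qlora
    r=32,                             # lora_r
    lora_alpha=64,                    # lora_alpha
    lora_dropout=0.25,                # lora_dropout
    target_modules=[                  # lora_target_modules (all linear projections)
        "gate_proj", "down_proj", "up_proj",
        "q_proj", "v_proj", "k_proj", "o_proj",
    ],
    task_type="CAUSAL_LM",
)
```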

# intern-rp-lora

This is a LoRA adapter trained on top of [internlm/internlm3-8b-instruct](https://huggingface.co/internlm/internlm3-8b-instruct) using the ToastyPigeon/some-rp, BeaverAI/cedo-unalignment, BeaverAI/foundRP, PocketDoc/Dans-Prosemaxx-Gutenberg, ToastyPigeon/SpringDragon-Instruct, allenai/tulu-3-sft-personas-instruction-following, and allura-org/fujin-cleaned-stage-2 datasets.
It achieves the following results on the evaluation set:
- Loss: 1.7197

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 69
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 4
- total_eval_batch_size: 4
- optimizer: paged_ademamix_8bit (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- num_epochs: 2

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.2794        | 0.0013 | 1    | 1.8317          |
| 1.6416        | 0.1    | 75   | 1.7826          |
| 2.3547        | 0.2    | 150  | 1.7643          |
| 1.9114        | 0.3    | 225  | 1.7546          |
| 2.0004        | 0.4    | 300  | 1.7474          |
| 2.2052        | 0.5    | 375  | 1.7428          |
| 1.9314        | 0.6    | 450  | 1.7377          |
| 2.202         | 0.7    | 525  | 1.7350          |
| 2.2453        | 0.8    | 600  | 1.7303          |
| 1.8392        | 0.9    | 675  | 1.7283          |
| 1.7018        | 1.0    | 750  | 1.7271          |
| 1.9736        | 1.0987 | 825  | 1.7264          |
| 2.0917        | 1.1987 | 900  | 1.7245          |
| 1.5679        | 1.2987 | 975  | 1.7239          |
| 2.0799        | 1.3987 | 1050 | 1.7225          |
| 1.8398        | 1.4987 | 1125 | 1.7220          |
| 1.9806        | 1.5987 | 1200 | 1.7211          |
| 1.7334        | 1.6987 | 1275 | 1.7209          |
| 2.1457        | 1.7987 | 1350 | 1.7205          |
| 1.7804        | 1.8987 | 1425 | 1.7202          |
| 2.1652        | 1.9987 | 1500 | 1.7197          |

### Framework versions

- PEFT 0.14.0
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
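The following is a minimal inference sketch, not part of the auto-generated card. It assumes the adapter in this repo loads cleanly onto the 4-bit base model, and it formats the prompt as ChatML by hand because training used `chat_template: chatml` rather than the base tokenizer's default template; the example prompt is purely illustrative.

```python
# Minimal inference sketch: 4-bit base model + this LoRA adapter, prompted with ChatML.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "internlm/internlm3-8b-instruct"
adapter_id = "ToastyPigeon/intern-rp-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)

# ChatML-formatted prompt, matching the training-time chat template.
prompt = (
    "<|im_start|>system\nYou are a creative roleplay partner.<|im_end|>\n"
    "<|im_start|>user\nDescribe the tavern we just walked into.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```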