Qwen3-14b-lora-codeAlpaca_20k

This model is a LoRA fine-tuned version of Qwen/Qwen3-14B, trained on the CodeAlpaca-20k dataset.

Model Overview

Qwen3-14b-lora-codeAlpaca_20k has the following features:

  • Type: Causal Language Model
  • Training Stage: Post-training (LoRA supervised fine-tuning of Qwen3-14B)
  • Trainable parameters: 64,225,280
  • Number of Layers: 40
  • Number of Attention Heads (GQA): 40 for Q and 8 for KV
  • Context Length: 32,768
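
Quickstart

A minimal inference sketch, assuming this repository hosts a PEFT LoRA adapter for Qwen/Qwen3-14B (the example prompt and generation settings are illustrative and not from the original card):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-14B"
adapter_id = "94insane/Qwen3-14b-lora-codeAlpaca_20k"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA weights to the base model

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))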

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-4
  • train_batch_size: 2
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: adamw_8bit
  • lr_scheduler_type: linear
  • training_steps: 100
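
The values above map directly onto standard Hugging Face trainer arguments. Below is a minimal training sketch assuming a TRL SFTTrainer setup; the dataset id, prompt formatting, and LoRA settings (rank, alpha, target modules) are illustrative assumptions and are not documented in the original card:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

base_id = "Qwen/Qwen3-14B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")

# LoRA settings are assumptions: a rank-16 adapter over all attention and MLP projections
# is consistent with the stated ~64.2M trainable parameters, but the exact config is not documented.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, peft_config)

# Assumed dataset id for CodeAlpaca-20k; the exact copy and prompt format used are not stated.
dataset = load_dataset("sahil2801/CodeAlpaca-20k", split="train")

def to_text(example):
    # Fold instruction/input/output rows into a single chat-formatted training string.
    prompt = example["instruction"] + ("\n" + example["input"] if example["input"] else "")
    messages = [{"role": "user", "content": prompt},
                {"role": "assistant", "content": example["output"]}]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

training_args = SFTConfig(
    per_device_train_batch_size=2,   # train_batch_size
    gradient_accumulation_steps=4,   # total batch size: 2 * 4 = 8
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    optim="adamw_8bit",              # requires bitsandbytes
    max_steps=100,                   # training_steps
    seed=3407,
    output_dir="outputs",
)

trainer = SFTTrainer(model=model, processing_class=tokenizer, train_dataset=dataset, args=training_args)
trainer.train()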

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.1

Citation

@misc{qwen3technicalreport,
      title={Qwen3 Technical Report}, 
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388}, 
}
