Qwen3-14b-lora-codeAlpaca_20k
This model is a LoRA fine-tuned version of Qwen3-14B, trained on the CodeAlpaca-20k dataset.
Model Overview
Qwen3-14b-lora-codeAlpaca_20k has the following features:
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Trainable parameters: 64,225,280
- Number of Layers: 40
- Number of Attention Heads (GQA): 40 for Q and 8 for KV
- Context Length: 32,768
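
Below is a minimal inference sketch using the Transformers library. The repository id `your-username/Qwen3-14b-lora-codeAlpaca_20k` is a placeholder, and the example prompt is illustrative only; it assumes the LoRA adapter has been merged into (or is loadable with) the base Qwen3-14B checkpoint.

```python
# Minimal inference sketch (hypothetical repo id; adjust to the actual model path).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-username/Qwen3-14b-lora-codeAlpaca_20k"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Build a chat-formatted prompt and generate a completion.
messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print(response)
```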
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-4
- train_batch_size: 2
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: adamw_8bit
- lr_scheduler_type: linear
- training_steps: 100
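
The sketch below shows how these hyperparameters map onto a LoRA fine-tuning run with PEFT + TRL. The exact training framework used for this model is not stated in the card, and the LoRA rank, alpha, target modules, and the `sahil2801/CodeAlpaca-20k` dataset repo id are assumptions chosen for illustration, not reported values.

```python
# Training sketch with assumed LoRA settings; only the hyperparameters listed
# above (batch size, LR, scheduler, steps, optimizer, seed) come from this card.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed dataset repo id for CodeAlpaca-20k.
dataset = load_dataset("sahil2801/CodeAlpaca-20k", split="train")

def to_text(example):
    # Flatten instruction/input/output into a single Alpaca-style training text.
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n" + example["input"]
    return {"text": prompt + "\n" + example["output"]}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

peft_config = LoraConfig(
    r=16,            # assumption: rank not reported in this card
    lora_alpha=16,   # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="qwen3-14b-lora-codealpaca",
    per_device_train_batch_size=2,   # train_batch_size: 2
    gradient_accumulation_steps=4,   # total_train_batch_size: 8
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    max_steps=100,                   # training_steps: 100
    optim="adamw_8bit",
    seed=3407,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-14B",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```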
Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1
Citation
@misc{qwen3technicalreport,
title={Qwen3 Technical Report},
author={Qwen Team},
year={2025},
eprint={2505.09388},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.09388},
}