# Qwen QLoRA Fine-tuned Model

This model is a fine-tuned version of [Qwen/Qwen2-1.5B](https://huggingface.co/Qwen/Qwen2-1.5B), trained with QLoRA on the TinyStories dataset.
## Model Details
- Base Model: Qwen/Qwen2-1.5B
- Model Size: medium (~1.5B parameters)
- Training Method: QLoRA
- Dataset: TinyStories (story generation)
- Training Samples: 10,000
- Training Framework: Transformers + PEFT + TRL
- Hardware: Google Colab Pro A100
## Training Details
- Training Method: QLoRA (4-bit quantized base model with LoRA adapters)
- Epochs: 1-2 (optimized for fast training)
- Learning Rate: 2e-4 to 3e-4
- Batch Size: tuned for a single A100 GPU
- Optimization: AdamW with cosine learning rate scheduling (see the training sketch below)
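
As a reference for reproducing this setup, here is a minimal QLoRA training sketch using Transformers, PEFT, and TRL that mirrors the settings above. The LoRA rank/alpha, target modules, batch size, and sequence handling are illustrative assumptions rather than the exact values used for this checkpoint, and the TRL API varies slightly across versions.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, prepare_model_for_kbit_training
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter configuration; rank, alpha, and target modules are assumptions
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# 10,000 TinyStories training samples, as listed in Model Details
dataset = load_dataset("roneneldan/TinyStories", split="train[:10000]")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="qwen-qlora-tinystories",
        dataset_text_field="text",
        num_train_epochs=1,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        per_device_train_batch_size=8,
        gradient_accumulation_steps=2,
        optim="paged_adamw_8bit",
        logging_steps=50,
    ),
)
trainer.train()
```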
## Usage
### Loading the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer from the fine-tuned repository
tokenizer = AutoTokenizer.from_pretrained("ericyu123/qwen-medium-qlora-tinystories-1751338371")

# Load the base model, then attach the LoRA/QLoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B")
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")
```
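
To reduce inference memory, the base model can instead be loaded in 4-bit before attaching the adapter. This is a minimal sketch assuming `bitsandbytes` is installed; the quantization settings mirror common QLoRA defaults rather than anything documented for this specific checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Optional: load the base model in 4-bit (NF4) to cut GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")
model.eval()
```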
### Generating Text
```python
import torch

def generate_story(prompt, max_length=200):
    # Tokenize the prompt and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_length,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            repetition_penalty=1.1,
        )
    # Return only the newly generated tokens, not the prompt
    return tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Example usage
story = generate_story("Once upon a time, there was a brave little mouse")
print(story)
```
## Performance
- Training Time: 15-90 minutes (depending on fine-tuning method and configuration)
- GPU Memory: 6-20GB (depending on fine-tuning method and configuration)
- Quality: well suited to short children's story generation (see Limitations)
## Limitations
- Trained primarily on children's stories (TinyStories dataset)
- May not perform well on other domains without additional training
- QLoRA adapters require the base model for inference; the adapter can also be merged into the base weights, as sketched below
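
If a standalone, adapter-free checkpoint is preferred, the LoRA weights can be merged into an unquantized copy of the base model with PEFT's `merge_and_unload`. The output directory below is only a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Merge the adapter into full-precision base weights for adapter-free inference
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B")
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")
merged = model.merge_and_unload()

# Save the merged model and tokenizer to a placeholder directory
merged.save_pretrained("qwen2-1.5b-tinystories-merged")
tokenizer = AutoTokenizer.from_pretrained("ericyu123/qwen-medium-qlora-tinystories-1751338371")
tokenizer.save_pretrained("qwen2-1.5b-tinystories-merged")
```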
## Citation
```bibtex
@misc{qwen-qlora-finetuned,
  title={Qwen QLoRA Fine-tuned for Story Generation},
  author={ericyu123},
  year={2025},
  howpublished={\url{https://huggingface.co/ericyu123/qwen-medium-qlora-tinystories-1751338371}}
}
```