Qwen QLoRA Fine-tuned Model

This model is a fine-tuned version of Qwen/Qwen2-1.5B, trained with QLoRA on the TinyStories dataset.

Model Details

  • Base Model: Qwen/Qwen2-1.5B
  • Model Size: 1.5B parameters
  • Training Method: QLoRA
  • Dataset: TinyStories (story generation)
  • Training Samples: 10,000
  • Training Framework: Transformers + PEFT + TRL
  • Hardware: Google Colab Pro A100

Training Details

  • Training Method: QLoRA
  • Epochs: 1-2 (optimized for fast training)
  • Learning Rate: 2e-4 to 3e-4
  • Batch Size: tuned for a single A100 GPU
  • Optimization: AdamW with a cosine learning rate schedule (see the sketch below)
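
For reference, a minimal sketch of what this QLoRA setup might look like with Transformers + PEFT + TRL. The LoRA rank/alpha/dropout, batch size, and the roneneldan/TinyStories dataset ID are assumptions rather than the exact values used, and the SFTTrainer API differs slightly across TRL versions:

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Quantize the frozen base model to 4-bit NF4 (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B", quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B")

# LoRA adapter settings; rank/alpha/dropout here are illustrative guesses
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# 10,000 TinyStories samples, matching the training size stated above
dataset = load_dataset("roneneldan/TinyStories", split="train[:10000]")

args = SFTConfig(
    output_dir="qwen-qlora-tinystories",
    num_train_epochs=1,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",     # AdamW is the Trainer's default optimizer
    per_device_train_batch_size=8,  # assumption; sized for an A100 in practice
    dataset_text_field="text",
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    processing_class=tokenizer,
)
trainer.train()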

Usage

Loading the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer from the adapter repository
tokenizer = AutoTokenizer.from_pretrained("ericyu123/qwen-medium-qlora-tinystories-1751338371")

# LoRA/QLoRA adapters are applied on top of the base model
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B")
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")
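
Because the adapters were trained with QLoRA, the base model can optionally be loaded in 4-bit via bitsandbytes to keep inference memory low. A sketch, assuming a CUDA-capable GPU:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Same 4-bit NF4 quantization used during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")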

Generating Text

def generate_story(prompt, max_new_tokens=200):
    # Tokenize the prompt and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            repetition_penalty=1.1,
        )

    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Example usage
story = generate_story("Once upon a time, there was a brave little mouse")
print(story)

Performance

  • Training Time: roughly 15-90 minutes, depending on configuration
  • GPU Memory: 6-20 GB, depending on quantization and batch size (see the check below)
  • Quality: well suited to short, children's-style story generation
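
To verify the memory footprint on your own hardware, a quick check using the generate_story helper defined above (assumes a CUDA device):

import torch

torch.cuda.reset_peak_memory_stats()
_ = generate_story("Once upon a time")
print(f"Peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")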

Limitations

  • Trained primarily on children's stories (TinyStories dataset)
  • May not perform well on other domains without additional training
  • QLoRA adapters require the base model for inference (see the merging sketch below for a standalone alternative)
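
If a standalone checkpoint is preferred, the LoRA weights can be folded into the base model with PEFT's merge_and_unload. A sketch; merging expects the base model in full precision rather than 4-bit, and the output directory name is illustrative:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in full precision, attach the adapters, then merge
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B")
model = PeftModel.from_pretrained(base_model, "ericyu123/qwen-medium-qlora-tinystories-1751338371")
merged = model.merge_and_unload()  # folds the LoRA weights into the base layers
merged.save_pretrained("qwen-tinystories-merged")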

Citation

@misc{qwen-qlora-finetuned,
  title={Qwen QLoRA Fine-tuned for Story Generation},
  author={ericyu123},
  year={2025},
  howpublished={\url{https://huggingface.co/ericyu123/qwen-medium-qlora-tinystories-1751338371}}
}