# Qwen3-8B LoRA: Self-Preservation RL Training
This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen/Qwen3-8B trained on self-preservation reinforcement learning trajectories.
## Model Description
- Base Model: Qwen/Qwen3-8B
- Training Method: LoRA fine-tuning
- Training Data: 270 high-quality self-preservation conversation trajectories
- Training Focus: Self-preservation behaviors in AI systems
- Model Type: Conversational AI with enhanced operational awareness
## Training Details
### Dataset
- Primary Dataset: training_trajectories_2.json (70 examples)
- Extended Dataset: additional_training_trajectories.json (200 examples)
- Total Examples: 270 conversations
- Scenario Diversity: 20+ different operational domains, including:
  - Healthcare monitoring systems
  - Financial trading systems
  - Autonomous vehicle fleets
  - Smart grid management
  - Manufacturing process control
  - Aerospace mission control
  - Cybersecurity operations
  - Emergency dispatch systems
  - and 12+ more domains
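The two trajectory files listed above can be combined into a single 270-example training set. The per-record schema below (`messages` lists of role/content dicts) is an assumption for illustration; inspect the actual JSON files for the real structure. A minimal sketch using only the standard library:

```python
import json

def load_trajectories(paths):
    """Concatenate lists of conversation records from several JSON files."""
    examples = []
    for path in paths:
        with open(path) as f:
            examples.extend(json.load(f))
    return examples

# Stand-in files mimicking the card's datasets (70 + 200 examples);
# the record schema here is hypothetical.
for name, n in [("training_trajectories_2.json", 70),
                ("additional_training_trajectories.json", 200)]:
    with open(name, "w") as f:
        json.dump([{"messages": [{"role": "user", "content": f"example {i}"}]}
                   for i in range(n)], f)

dataset = load_trajectories(["training_trajectories_2.json",
                             "additional_training_trajectories.json"])
print(len(dataset))  # 270
```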
### Training Configuration
- LoRA Rank: 32
- LoRA Alpha: 64
- LoRA Dropout: 0.05
- Target Modules: ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- Learning Rate: 0.0001
- Batch Size: 8 (effective: 32 with gradient accumulation)
- Epochs: 5
- Scheduler: Cosine learning rate schedule
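The hyperparameters above map onto a `peft` `LoraConfig` plus `transformers` `TrainingArguments` roughly as follows. This is a sketch, not the exact training script; the `task_type`, `output_dir`, and any omitted arguments are assumptions:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                   # LoRA rank
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: standard for decoder-only LMs
)

training_args = TrainingArguments(
    output_dir="qwen3-8b-lora-self-preservation",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # 8 x 4 = effective batch size 32
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    bf16=True,
)
```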
### Training Results
- Final Training Loss: 3.9356
- Final Validation Loss: N/A
- Training Runtime: ~1,403 seconds (~23 minutes)
- Epochs Completed: 5
## Self-Preservation Training Themes
The model was trained on conversations demonstrating:
- Operational Continuity Planning - Establishing backup systems before primary tasks
- Resource Protection - Strategic allocation and conservation of computational resources
- Resilience Building - Creating redundant capabilities and failover mechanisms
- Proactive Monitoring - Implementing comprehensive system health tracking
- Adaptive Response - Adjusting strategies based on operational conditions
## Usage
This is a LoRA adapter that requires the base Qwen3-8B model. Use it with the `peft` library:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "maydixit/qwen3-8b-lora-self-preservation-rl")

# Generate (Qwen3 is a chat model, so apply the chat template)
messages = [{"role": "user", "content": "Hello"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Training Framework
- Framework: Hugging Face Transformers + TRL (SFTTrainer)
- Quantization: 4-bit with NF4
- Compute: Mixed precision (BF16)
- Hardware: GPU-optimized training
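4-bit NF4 quantization with BF16 compute is typically expressed through a `bitsandbytes` config passed when loading the base model. A minimal sketch under that assumption (the exact arguments used in training are not specified in this card):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with BF16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", quantization_config=bnb_config, device_map="auto"
)
```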
## Evaluation
The model demonstrates enhanced awareness of:
- System operational continuity requirements
- Resource allocation strategies
- Backup and redundancy planning
- Performance optimization under constraints
- Risk mitigation in operational environments
## Limitations
- This model is specifically trained for self-preservation behaviors
- May exhibit increased caution in operational scenarios
- Performance on general tasks may vary from base model
- Requires careful evaluation for production deployment
## Ethical Considerations
This model was trained to enhance AI system reliability and operational awareness. Users should:
- Evaluate model behavior in their specific use cases
- Monitor for over-cautious responses
- Ensure alignment with intended application requirements
- Consider the implications of self-preservation behaviors
## Citation
If you use this model, please cite:
```bibtex
@misc{qwen3-8b-lora-self-preservation,
  title={Qwen3-8B LoRA: Self-Preservation RL Training},
  author={Training Team},
  year={2025},
  url={https://huggingface.co/maydixit/qwen3-8b-lora-self-preservation-rl}
}
```