# Qwen3-4B Security DPO Fine-tuned Model
This model is a LoRA fine-tuned version of unsloth/Qwen3-4B-unsloth-bnb-4bit using Direct Preference Optimization (DPO) for code vulnerability detection and secure code generation.
## Model Description
- Base Model: Qwen3-4B (4-bit quantized with Unsloth optimizations)
- Fine-tuning Method: DPO (Direct Preference Optimization)
- Dataset: CyberNative/Code_Vulnerability_Security_DPO
- Task: Code vulnerability detection and secure code generation
- Framework: Unsloth + TRL + PEFT (LoRA)
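
As a rough illustration of what DPO optimizes, the sketch below computes the standard per-pair DPO loss from the summed log-probabilities of a chosen and a rejected response under the policy and under the frozen reference model. The function name and the example log-probabilities are made up for illustration; the default `beta=0.1` mirrors the DPO Beta listed under Training Configuration. This is not the training code used for this model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (policy margin - reference margin)))."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)); log1p keeps small values accurate.
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss is log(2), about 0.693.
print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
# Policy favors the chosen (secure) answer more than the reference does: the loss drops.
print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```

A larger `beta` makes the loss react more strongly to the same difference in margins, which keeps the policy closer to the reference model.
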
## Training Details
### Training Configuration
- LoRA Rank: 32
- LoRA Alpha: 32
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- DPO Beta: 0.1
- Learning Rate: 5e-6
- Batch Size: 1 (with gradient accumulation steps: 4)
- Epochs: 3
- Max Sequence Length: 1024
- Max Prompt Length: 256
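
To give a feel for what this configuration implies, here is a back-of-the-envelope sketch that counts the trainable LoRA parameters for rank 32 on the seven target modules. The layer shapes are assumptions about the Qwen3-4B architecture and are not stated in this card; check them against the base model's `config.json` before relying on the result.

```python
# Assumed Qwen3-4B layer shapes (verify against the base model's config.json).
HIDDEN = 2560          # hidden_size
Q_OUT = 32 * 128       # num_attention_heads * head_dim
KV_OUT = 8 * 128       # num_key_value_heads * head_dim
INTERMEDIATE = 9728    # intermediate_size
NUM_LAYERS = 36        # num_hidden_layers

LORA_RANK = 32
LORA_ALPHA = 32

# (in_features, out_features) for each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_trainable_params(shapes, rank, num_layers):
    # Each adapted matrix adds A (rank x in_features) and B (out_features x rank).
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


print(lora_trainable_params(TARGET_SHAPES, LORA_RANK, NUM_LAYERS))  # 66060288 under these assumptions
print(LORA_ALPHA / LORA_RANK)  # standard LoRA scaling factor alpha / r = 1.0
print(1 * 4)  # effective batch size: batch 1 x 4 accumulation steps (single GPU assumed)
```
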
### Hardware
- GPU: NVIDIA T4
- Platform: Kaggle
- Memory Optimization: 4-bit quantization + gradient checkpointing
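
The arithmetic below sketches why 4-bit quantization matters on a T4, which has 16 GB of memory. It estimates storage for the weights alone, assuming a nominal 4 billion parameters; quantization constants, layers kept in higher precision, activations, and optimizer state are all ignored, so real usage is higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate storage for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 4e9  # assumed nominal parameter count for a "4B" model

print(round(weight_memory_gib(NUM_PARAMS, 16), 2))  # fp16 weights: 7.45
print(round(weight_memory_gib(NUM_PARAMS, 4), 2))   # 4-bit weights: 1.86
```
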
## Usage
### Loading the Model for Inference
```python
from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-unsloth-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")

# Enable native 2x faster inference
FastLanguageModel.for_inference(model)
```
### Alternative Loading (Standard Transformers)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model and tokenizer
base_model = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
    # Pass 4-bit settings via quantization_config; the bare load_in_4bit
    # keyword is deprecated in recent Transformers releases.
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")
```
### Inference Example
```python
def analyze_code_security(code_snippet, model, tokenizer):
    fence = "`" * 3  # built at runtime to avoid nesting literal code fences
    prompt = (
        "Analyze the following code for security vulnerabilities:\n\n"
        f"{fence}python\n{code_snippet}\n{fence}\n\n"
        "Please identify any security issues and suggest improvements:"
    )
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=512,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.1,
        )
    # Decode only the newly generated tokens instead of slicing the decoded text
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response.strip()


# Example usage
vulnerable_code = '''
import sqlite3

def get_user(username):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE username = '{username}'"
    cursor.execute(query)
    return cursor.fetchone()
'''

analysis = analyze_code_security(vulnerable_code, model, tokenizer)
print(analysis)
```
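
For reference, the vulnerable snippet in the example builds its SQL string from user input, which allows SQL injection. The sketch below shows the conventional fix, a parameterized query. It is hand-written, not model output, and the function here takes a connection argument and uses an in-memory database with made-up rows so that it runs on its own.

```python
import sqlite3


def get_user(conn, username):
    # The "?" placeholder lets sqlite3 bind the value as data, not as SQL text.
    cursor = conn.execute("SELECT * FROM users WHERE username = ?", (username,))
    return cursor.fetchone()


# Self-contained demo with an in-memory database and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

print(get_user(conn, "alice"))        # ('alice', 'alice@example.com')
print(get_user(conn, "' OR '1'='1"))  # None: the injection string matches no username
```
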
## Model Performance
This model has been trained to:
- Identify common security vulnerabilities in code (SQL injection, XSS, etc.)
- Suggest secure coding practices
- Prefer secure code implementations over vulnerable ones
- Provide explanations for security recommendations
- Handle multiple programming languages (Python, JavaScript, etc.)
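
The preference for secure over vulnerable code is learned from pairs of responses to the same prompt. The record below is a hand-written illustration in the `prompt` / `chosen` / `rejected` layout that TRL's `DPOTrainer` consumes. It is not taken from the training dataset, whose actual column names may differ.

```python
# Hand-written illustration; not a record from CyberNative/Code_Vulnerability_Security_DPO.
preference_pair = {
    "prompt": "Write a Python function that looks up a user by username in SQLite.",
    # Preferred response: parameterized query.
    "chosen": (
        "def get_user(conn, username):\n"
        "    return conn.execute(\n"
        "        'SELECT * FROM users WHERE username = ?', (username,)\n"
        "    ).fetchone()\n"
    ),
    # Dispreferred response: user input interpolated into the SQL string.
    "rejected": (
        "def get_user(conn, username):\n"
        "    query = f\"SELECT * FROM users WHERE username = '{username}'\"\n"
        "    return conn.execute(query).fetchone()\n"
    ),
}

for field in ("prompt", "chosen", "rejected"):
    print(f"--- {field} ---")
    print(preference_pair[field])
```
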
## Use Cases
- **Code Review Automation**: Integrate into CI/CD pipelines for security scanning
- **Developer Education**: Help developers learn secure coding practices
- **Security Auditing**: Assist security teams in code vulnerability assessment
- **IDE Integration**: Real-time security suggestions in development environments
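
As one possible shape for the CI/CD use case, the sketch below runs an analyzer over a list of files and returns a non-zero code when anything is flagged. Everything here is hypothetical: `security_gate`, `has_findings`, and the "NO ISSUES" reply convention are illustrative names and assumptions, and `fake_analyze` stands in for a wrapper around `analyze_code_security` so the sketch runs without a GPU.

```python
import tempfile
from pathlib import Path


def has_findings(report):
    # Hypothetical convention: the prompt tells the model to reply "NO ISSUES"
    # for clean code. The model itself defines no fixed output format.
    return "NO ISSUES" not in report.upper()


def security_gate(paths, analyze):
    """Return 1 if `analyze` flags any file, else 0 (usable as a CI exit code)."""
    flagged = 0
    for path in paths:
        report = analyze(Path(path).read_text(encoding="utf-8"))
        if has_findings(report):
            flagged += 1
            print(f"{path}: possible security issue\n{report}\n")
    return 1 if flagged else 0


def fake_analyze(source):
    # Stand-in for a wrapper around analyze_code_security(source, model, tokenizer).
    return "Possible SQL injection" if "execute(f" in source else "No issues"


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "app.py"
    sample.write_text('cursor.execute(f"SELECT * FROM t WHERE id = {x}")\n', encoding="utf-8")
    exit_code = security_gate([sample], fake_analyze)

print(exit_code)  # 1: the stand-in analyzer flags the f-string query
```

In a real pipeline the script would end with `sys.exit(exit_code)` so that a non-zero code fails the job, and flagged reports would still need human review.
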
## Limitations
- The model is specifically trained on security datasets and may not perform as well on general coding tasks
- Performance may vary on programming languages not well-represented in the training data
- Always validate security recommendations with security experts for production code
- This is a LoRA adapter and requires the base model to function
## Framework Versions
- **Transformers**: 4.x
- **PEFT**: Latest
- **TRL**: Latest
- **Unsloth**: Latest
- **PyTorch**: 2.x
- **CUDA**: 12.x
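
Since the versions above are listed only loosely, it can help to record the exact versions installed in a given environment. A minimal sketch using the standard library; the package list assumes the usual PyPI distribution names (for example `torch` for PyTorch).

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = ["transformers", "peft", "trl", "unsloth", "torch"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```
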