# Qwen3-4B Security DPO Fine-tuned Model

This model is a LoRA adapter for `unsloth/Qwen3-4B-unsloth-bnb-4bit`, fine-tuned with Direct Preference Optimization (DPO) for code vulnerability detection and secure code generation.

## Model Description

- **Base Model**: Qwen3-4B (4-bit quantized with Unsloth optimizations)
- **Fine-tuning Method**: DPO (Direct Preference Optimization)
- **Dataset**: CyberNative/Code_Vulnerability_Security_DPO
- **Task**: Code vulnerability detection and secure code generation
- **Framework**: Unsloth + TRL + PEFT (LoRA)
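
For readers new to DPO, the per-pair objective can be sketched in plain Python. This is the standard DPO loss written out by hand, not code taken from this model's training run; the log-probability values are made-up numbers for illustration, and `beta=0.1` mirrors the DPO Beta listed in the training configuration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy favors each response than the frozen reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) is the same as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Hypothetical log-probabilities, chosen only to illustrate the behavior.
untrained = dpo_loss(-40.0, -40.0, -40.0, -40.0)  # policy equals reference
improved = dpo_loss(-35.0, -45.0, -40.0, -40.0)   # policy prefers the secure answer
print(round(untrained, 4), round(improved, 4))
```

The loss falls as the policy assigns relatively more probability to the chosen (secure) response than the reference model does, which is the preference signal the dataset provides.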

## Training Details

### Training Configuration

- **LoRA Rank**: 32
- **LoRA Alpha**: 32
- **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **DPO Beta**: 0.1
- **Learning Rate**: 5e-6
- **Batch Size**: 1 (with gradient accumulation steps: 4)
- **Epochs**: 3
- **Max Sequence Length**: 1024
- **Max Prompt Length**: 256
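
A few quantities follow directly from the values listed above. The sketch below derives them with a small dataclass; `TrainingConfig` and its field names are hypothetical (they are not the TRL or PEFT config classes), and it assumes a single GPU, standard `alpha / rank` LoRA scaling, and a sequence limit that covers prompt plus response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the training configuration above.
    lora_rank: int = 32
    lora_alpha: int = 32
    dpo_beta: float = 0.1
    learning_rate: float = 5e-6
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    epochs: int = 3
    max_seq_length: int = 1024
    max_prompt_length: int = 256

    @property
    def effective_batch_size(self) -> int:
        # Preference pairs per optimizer step (assumes one GPU).
        return self.per_device_batch_size * self.gradient_accumulation_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the adapter update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def max_completion_tokens(self) -> int:
        # Assumption: the sequence limit covers prompt plus response.
        return self.max_seq_length - self.max_prompt_length


config = TrainingConfig()
print(config.effective_batch_size, config.lora_scaling, config.max_completion_tokens)
```

Under these assumptions each optimizer step sees 4 preference pairs, the adapter update is applied at scale 1.0, and up to 768 tokens remain for the response after a full-length prompt.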

### Hardware

- **GPU**: NVIDIA T4
- **Platform**: Kaggle
- **Memory Optimization**: 4-bit quantization + gradient checkpointing
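
A rough back-of-the-envelope estimate shows why 4-bit quantization matters on a T4, which has 16 GB of memory. The parameter count below is an assumption read off the "4B" in the model name, and the figure covers weights only: quantization constants, activations, LoRA weights, and optimizer state are ignored.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: roughly 4e9 parameters, taken from the "4B" in the model name.
NUM_PARAMS = 4e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)
print(f"fp16 weights: {fp16_gib:.1f} GiB, 4-bit weights: {int4_gib:.1f} GiB")
```

Under this estimate the 16-bit weights alone would take about half of the card, while the 4-bit weights leave most of it free for activations and training state.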

## Usage

### Loading the Model for Inference

```python
from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-unsloth-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")

# Enable native 2x faster inference
FastLanguageModel.for_inference(model)
```

### Alternative Loading (Standard Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(base_model)
# The checkpoint is already stored in 4-bit bitsandbytes format, so its saved
# quantization config is used; the `bitsandbytes` package must be installed.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")
```

### Inference Example

```python
import torch


def analyze_code_security(code_snippet, model, tokenizer):
    # Build the Markdown fence at runtime so this README block is not closed early.
    fence = "`" * 3
    prompt = (
        "Analyze the following code for security vulnerabilities:\n\n"
        f"{fence}python\n{code_snippet}\n{fence}\n\n"
        "Please identify any security issues and suggest improvements:"
    )

    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=512,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.1,
        )

    # Decode only the newly generated tokens. Slicing the decoded string by
    # len(prompt) is fragile because decoding may not reproduce the prompt exactly.
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response.strip()


# Example usage
vulnerable_code = '''import sqlite3

def get_user(username):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE username = '{username}'"
    cursor.execute(query)
    return cursor.fetchone()
'''

analysis = analyze_code_security(vulnerable_code, model, tokenizer)
print(analysis)
```


## Model Performance

This model has been trained to:
- ✅ Identify common security vulnerabilities in code (SQL injection, XSS, etc.)
- ✅ Suggest secure coding practices
- ✅ Prefer secure code implementations over vulnerable ones
- ✅ Provide explanations for security recommendations
- ✅ Handle multiple programming languages (Python, JavaScript, etc.)
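
As an illustration of the "secure over vulnerable" preference, the sketch below contrasts an injectable query with a parameterized one. It is a hand-written example, not a sample from the training dataset or model output; the table, rows, and function names are hypothetical.

```python
import sqlite3


def get_user_vulnerable(conn, username):
    # Rejected style: user input is interpolated straight into the SQL text.
    query = f"SELECT * FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def get_user_secure(conn, username):
    # Preferred style: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT * FROM users WHERE username = ?", (username,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)

payload = "' OR '1'='1"
print(len(get_user_vulnerable(conn, payload)))  # the injected OR clause matches every row
print(len(get_user_secure(conn, payload)))      # the payload is treated as a literal name
```

With the payload above, the interpolated query returns both rows, while the parameterized query returns none.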

## Use Cases

- **Code Review Automation**: Integrate into CI/CD pipelines for security scanning
- **Developer Education**: Help developers learn secure coding practices
- **Security Auditing**: Assist security teams in code vulnerability assessment
- **IDE Integration**: Real-time security suggestions in development environments
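
For the code review use case, one possible shape is a small driver that runs an analysis callable over a set of files. This is a sketch under assumptions: `scan_files` and `fake_analyze` are hypothetical names, and `fake_analyze` is a stand-in so the example runs without a GPU.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def scan_files(paths: Iterable[Path], analyze: Callable[[str], str]) -> dict[str, str]:
    """Run an analysis callable over each file and collect the reports by path."""
    reports = {}
    for path in paths:
        source = path.read_text(encoding="utf-8")
        reports[str(path)] = analyze(source)
    return reports


def fake_analyze(source: str) -> str:
    # Hypothetical stand-in for the model call; it only looks for one pattern.
    return "possible SQL injection" if 'f"SELECT' in source else "no findings"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "db.py"
    target.write_text('query = f"SELECT * FROM users WHERE id = {uid}"', encoding="utf-8")
    reports = scan_files([target], fake_analyze)

print(list(reports.values()))
```

In a real pipeline the stand-in could be replaced by something like `lambda src: analyze_code_security(src, model, tokenizer)`, using the function from the inference example.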

## Limitations

- The model is specifically trained on security datasets and may not perform as well on general coding tasks
- Performance may vary on programming languages not well-represented in the training data
- Always validate security recommendations with security experts for production code
- This is a LoRA adapter; it requires the base model to function

## Framework Versions

- **Transformers**: 4.x
- **PEFT**: Latest
- **TRL**: Latest  
- **Unsloth**: Latest
- **PyTorch**: 2.x
- **CUDA**: 12.x
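
Because several entries above are listed only as "Latest", it can help to record the exact versions in your own environment. A minimal sketch using the standard library follows; the package names are assumed to match their PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(["transformers", "peft", "trl", "unsloth", "torch"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```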
