Qwen3-4B Security DPO Fine-tuned Model

This model is a LoRA fine-tuned version of unsloth/Qwen3-4B-unsloth-bnb-4bit using Direct Preference Optimization (DPO) for code vulnerability detection and secure code generation.

Model Description

  • Base Model: Qwen3-4B (4-bit quantized with Unsloth optimizations)
  • Fine-tuning Method: DPO (Direct Preference Optimization)
  • Dataset: CyberNative/Code_Vulnerability_Security_DPO
  • Task: Code vulnerability detection and secure code generation
  • Framework: Unsloth + TRL + PEFT (LoRA)
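
DPO trains on preference pairs: each example holds a prompt, a preferred ("chosen") response, and a dispreferred ("rejected") response. The sketch below shows one such record and a small helper that maps a raw row onto the `prompt`/`chosen`/`rejected` keys that TRL's DPO trainer conventionally expects. The raw field name `question`, the helper name `to_preference_pair`, and the sample contents are assumptions made for illustration; check the dataset card for the actual columns.

```python
# Hypothetical raw row; the field names and contents are illustrative assumptions,
# not copied from CyberNative/Code_Vulnerability_Security_DPO.
raw_row = {
    "question": "Write a Python function that looks up a user by name in SQLite.",
    "chosen": "cursor.execute('SELECT * FROM users WHERE username = ?', (name,))",
    "rejected": "cursor.execute(f\"SELECT * FROM users WHERE username = '{name}'\")",
}


def to_preference_pair(row, prompt_key="question"):
    """Map a raw row onto the prompt/chosen/rejected layout used for DPO."""
    return {
        "prompt": row[prompt_key],
        "chosen": row["chosen"],      # preferred (secure) completion
        "rejected": row["rejected"],  # dispreferred (vulnerable) completion
    }


pair = to_preference_pair(raw_row)
print(sorted(pair))
```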

Training Details

Training Configuration

  • LoRA Rank: 32
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • DPO Beta: 0.1
  • Learning Rate: 5e-6
  • Batch Size: 1 (with gradient accumulation steps: 4)
  • Epochs: 3
  • Max Sequence Length: 1024
  • Max Prompt Length: 256
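
To make the role of the DPO beta concrete, here is a minimal sketch of the per-example DPO loss in plain Python. The log-probability values are invented for illustration; in real training they come from the policy and the frozen reference model, and TRL computes the loss internally. The function name `dpo_loss` is likewise only a label for this sketch.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)); adequate for the small values used here.
    return math.log1p(math.exp(-logits))


# Made-up sequence log-probabilities, for illustration only.
no_preference = dpo_loss(-12.0, -12.0, -12.0, -12.0)   # policy equals reference
prefers_chosen = dpo_loss(-10.0, -14.0, -12.0, -12.0)  # policy favors the chosen answer

print(round(no_preference, 4))
print(prefers_chosen < no_preference)
```

When the policy matches the reference, the loss is ln 2 (about 0.693); it falls as the policy assigns relatively more probability to the chosen response. A larger beta makes the loss more sensitive to the same log-ratio gap.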

Hardware

  • GPU: NVIDIA T4
  • Platform: Kaggle
  • Memory Optimization: 4-bit quantization + gradient checkpointing

Usage

Loading the Model for Inference

from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-unsloth-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")

# Enable native 2x faster inference
FastLanguageModel.for_inference(model)

Alternative Loading (Standard Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = "unsloth/Qwen3-4B-unsloth-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=True
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "AdamDS/qwen3-security-dpo-4b")

Inference Example

import torch

def analyze_code_security(code_snippet, model, tokenizer):
    # Build the Markdown fence at runtime so the prompt still wraps the code in a
    # python code block without embedding a literal fence in this example.
    fence = "`" * 3
    prompt = (
        "Analyze the following code for security vulnerabilities:\n\n"
        f"{fence}python\n{code_snippet}\n{fence}\n\n"
        "Please identify any security issues and suggest improvements:"
    )

    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=512,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.1,
        )

    # Decode only the newly generated tokens. Slicing the decoded string by
    # len(prompt) can misalign when the tokenizer does not round-trip the prompt exactly.
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response.strip()

# Example usage
vulnerable_code = '''
import sqlite3

def get_user(username):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE username = '{username}'"
    cursor.execute(query)
    return cursor.fetchone()
'''

analysis = analyze_code_security(vulnerable_code, model, tokenizer)
print(analysis)
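
For reference, the kind of fix a reviewer would expect for the vulnerable snippet above is a parameterized query. The sketch below is written by hand and is not model output; the `conn` parameter and the in-memory table are assumptions added so the example runs on its own.

```python
import sqlite3


def get_user(conn, username):
    """Look up a user with a bound parameter instead of string interpolation."""
    cursor = conn.cursor()
    # The ? placeholder makes sqlite3 treat username strictly as data.
    cursor.execute("SELECT * FROM users WHERE username = ?", (username,))
    return cursor.fetchone()


# Throwaway in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

found = get_user(conn, "alice")
injected = get_user(conn, "' OR '1'='1")  # classic injection payload

print(found)
print(injected)
```

With the bound parameter, the injection payload is compared literally against the `username` column and matches no row.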


## Model Performance

This model has been trained to:
- ✅ Identify common security vulnerabilities in code (SQL injection, XSS, etc.)
- ✅ Suggest secure coding practices
- ✅ Prefer secure code implementations over vulnerable ones
- ✅ Provide explanations for security recommendations
- ✅ Handle multiple programming languages (Python, JavaScript, etc.)

## Use Cases

- **Code Review Automation**: Integrate into CI/CD pipelines for security scanning
- **Developer Education**: Help developers learn secure coding practices
- **Security Auditing**: Assist security teams in code vulnerability assessment
- **IDE Integration**: Real-time security suggestions in development environments
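
As a rough sketch of the code-review use case, the helper below walks a set of source files and collects one analysis per file. The `review_files` name and the `analyze` callback are hypothetical; in practice the callback would wrap a call to the model, such as the `analyze_code_security` function from the inference example, and the stub here only stands in for it.

```python
from pathlib import Path
import tempfile


def review_files(paths, analyze):
    """Run analyze(source_code) on each file and return {path: analysis_text}."""
    report = {}
    for path in paths:
        source = Path(path).read_text(encoding="utf-8")
        report[str(path)] = analyze(source)
    return report


def stub_analyze(source):
    """Placeholder for a model call; flags f-string SQL as a toy heuristic."""
    return "possible SQL injection" if 'f"SELECT' in source else "no findings"


# Demonstrate on a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "db.py"
    sample.write_text('query = f"SELECT * FROM users WHERE name = \'{name}\'"\n', encoding="utf-8")
    report = review_files([sample], stub_analyze)

print(list(report.values()))
```

Keeping the model call behind a plain callback makes the file-walking logic testable without a GPU; how findings should gate a pipeline is a policy decision left to the integrator.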

## Limitations

- The model is specifically trained on security datasets and may not perform as well on general coding tasks
- Performance may vary on programming languages not well-represented in the training data
- Always validate security recommendations with security experts for production code
- This is a LoRA adapter, not a standalone model; it must be loaded on top of the base model to function

## Framework Versions

- **Transformers**: 4.x
- **PEFT**: Latest
- **TRL**: Latest  
- **Unsloth**: Latest
- **PyTorch**: 2.x
- **CUDA**: 12.x