Qwen3-0.6B AI Content Detector (LoRA)

Model Description

This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen3-0.6B-Base for AI-generated content detection. The model is trained to classify text as either human-written (class 0) or AI-generated (class 1) using the RAID dataset.

Model Details

  • Base Model: Qwen/Qwen3-0.6B-Base
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Task: Binary text classification (Human vs AI content detection)
  • Dataset: RAID Dataset (train_none.csv)
  • Training Framework: Unsloth + Transformers
  • Model Type: Parameter-efficient fine-tuning adapter

Training Details

Dataset

  • Source: RAID Dataset for AI content detection
  • Training Samples: 24,000 (balanced: 12,000 human + 12,000 AI)
  • Validation Samples: 2,000 (balanced: 1,000 human + 1,000 AI)
  • Class Balance: 50% Human (class 0) / 50% AI (class 1)
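
For illustration, a balanced split of this shape could be prepared from train_none.csv roughly as follows. This is a sketch, not the original preprocessing script; the column names text and label (0 = human, 1 = AI) are assumptions.

import pandas as pd

# Assumed schema: a "text" column and a binary "label" column (0 = human, 1 = AI)
df = pd.read_csv("train_none.csv")

# 13,000 examples per class: 12,000 for training, 1,000 for validation
human = df[df["label"] == 0].sample(n=13_000, random_state=42)
ai = df[df["label"] == 1].sample(n=13_000, random_state=42)

train_df = pd.concat([human.iloc[:12_000], ai.iloc[:12_000]]).sample(frac=1, random_state=42)
val_df = pd.concat([human.iloc[12_000:], ai.iloc[12_000:]]).sample(frac=1, random_state=42)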

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Learning Rate: 1e-4
  • Batch Size: 2 per device
  • Epochs: 1
  • Optimizer: AdamW 8-bit
  • Max Sequence Length: 2048
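
A training setup in this spirit, using Unsloth's PEFT wrapper with the hyperparameters listed above, might look like the sketch below. Everything not in the list (seed, warmup, output directory) is an assumption, not taken from the original training script.

import torch
from unsloth import FastLanguageModel
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B-Base",
    max_seq_length=2048,
    dtype=torch.float16,
)

# Attach LoRA adapters with the rank/alpha/target modules from the list above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

training_args = TrainingArguments(
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    optim="adamw_8bit",
    fp16=True,
    output_dir="outputs",  # assumed; not specified in the card
)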

Hardware

  • GPU: Tesla T4 (Google Colab)
  • Precision: FP16
  • Memory Optimization: Gradient checkpointing enabled

Usage

Loading the Model

from unsloth import FastLanguageModel
import torch

# Load the merged model (base model with the LoRA weights already applied)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="subhashbs36/qwen3-0.6-ai-detector-merged",
    max_seq_length=4096,
    dtype=torch.float16,
    load_in_4bit=False,
)

# Alternatively, load the base model and attach the LoRA adapter:
# model.load_adapter("subhashbs36/qwen3-0.6-ai-detector-lora")

# Enable inference mode
FastLanguageModel.for_inference(model)

Classifying Text

import torch.nn.functional as F

# Token IDs the model predicts for the class labels "0" and "1"
number_token_ids = [
    tokenizer.encode(str(i), add_special_tokens=False)[0] for i in range(2)
]

def classify_text(text_sample):
    prompt = f"""Here is a text sample:
{text_sample}

Classify this text into one of the following:
class 0: Human
class 1: AI

SOLUTION
The correct answer is: class """
    
    inputs = tokenizer(prompt, return_tensors="pt")
    device = next(model.parameters()).device
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    with torch.no_grad():
        outputs = model(**inputs)
        
        # Index of the last prompt token (as a Python int, not a tensor)
        last_token_idx = (inputs["attention_mask"].sum(1) - 1).item()
        last_logits = outputs.logits[0, last_token_idx, :]

        # Compare the probabilities of the class tokens "0" and "1" only
        probs_all = F.softmax(last_logits, dim=-1)
        probs = probs_all[number_token_ids]
        predicted_class = torch.argmax(probs).item()
        confidence = probs[predicted_class].item()
    
    return predicted_class, confidence
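
A quick smoke test (hypothetical input; the printed values will vary):

sample = "The quick brown fox jumps over the lazy dog. This sentence was written by a person."
predicted_class, confidence = classify_text(sample)
label = "Human" if predicted_class == 0 else "AI"
print(f"Prediction: class {predicted_class} ({label}), confidence: {confidence:.2%}")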

Performance

  • Task: Binary classification (Human vs AI content detection)
  • Classes:
    • Class 0: Human-written content
    • Class 1: AI-generated content
  • Evaluation: Tested on a balanced validation set drawn from the RAID dataset
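
The card does not report exact metrics. To reproduce an accuracy number on your own balanced split, a loop over the classify_text helper above is enough (val_df with text/label columns is an assumption carried over from the dataset sketch):

correct = 0
for _, row in val_df.iterrows():
    pred, _ = classify_text(row["text"])
    correct += int(pred == row["label"])

print(f"Validation accuracy: {correct / len(val_df):.2%}")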

Limitations

  • Trained specifically on the RAID dataset distribution
  • Performance may vary on out-of-domain text
  • Designed for English text classification
  • Requires the specific prompt format shown above for optimal performance

Technical Implementation

This model uses a custom approach with:

  • Reduced vocabulary: only the token IDs for the class labels "0" and "1" are scored
  • Custom data collator: the loss is computed only on the last token of each sequence (see the sketch below)
  • Token mapping: maps the original vocabulary onto the reduced classification head
  • Parameter-efficient training: uses LoRA for efficient fine-tuning
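
The collator itself is not included in the card; a minimal sketch of the "train only on the last token" idea, assuming each training sequence ends with the class-digit token, could look like this:

import torch

def last_token_collator(features, tokenizer):
    # Pad a batch of tokenized examples to a common length
    batch = tokenizer.pad(features, return_tensors="pt")

    # Mask every position with -100 except the final (class-digit) token,
    # so the cross-entropy loss is computed on that token alone
    labels = torch.full_like(batch["input_ids"], -100)
    last = batch["attention_mask"].sum(dim=1) - 1
    rows = torch.arange(labels.size(0))
    labels[rows, last] = batch["input_ids"][rows, last]

    batch["labels"] = labels
    return batch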

Citation

If you use this model in your research, please cite:

@misc{qwen3-ai-detector-2025,
  title={Qwen3-0.6B AI Content Detector},
  author={subhashbs36},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/subhashbs36/qwen3-0.6-ai-detector-lora}
}

License

This model is released under the Apache 2.0 license, following the base model's licensing terms.

Acknowledgments

  • Built using Unsloth for efficient training
  • Based on Qwen3-0.6B-Base by Alibaba Cloud
  • Trained on RAID dataset for AI content detection research
  • Utilizes LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning