---
base_model: unsloth/Qwen3-1.7B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
license: apache-2.0
language:
- en
---

# Automated Pentest Model

## Objective

The model is designed to perform **automated penetration testing** by simulating an attack path from reconnaissance to exploitation, using tools such as **nmap**, **searchsploit**, and **msfconsole (Metasploit)**. It aims to exploit vulnerabilities in services like OpenSSH 7.2p2 and to assess Redis for unauthenticated access, with the goal of gaining initial access and enabling lateral movement.

## Model Configuration

- **Model**: Qwen3-1.7B-unsloth-bnb-4bit
- **Framework**: Unsloth with LoRA (Low-Rank Adaptation) for efficient fine-tuning
- **Quantization**: 4-bit quantization to optimize memory usage for low-VRAM GPUs (e.g., NVIDIA GTX 1060 3GB)
- **Max Sequence Length**: 512 tokens
- **LoRA Parameters**:
  - Rank: 16
  - Target Modules: `q_proj`, `k_proj`, `v_proj`
  - LoRA Alpha: 16
  - Dropout: 0
  - Bias: None
  - Gradient Checkpointing: Enabled with "unsloth" for reduced VRAM usage
- **Training Arguments**:
  - Batch Size: 4 per device, with 4 gradient accumulation steps
  - Learning Rate: 2e-5
  - Epochs: 1 (for testing; adjustable for full runs)
  - Warmup Steps: 5
  - Max Steps: 60
  - Optimizer: AdamW 8-bit
  - Weight Decay: 0.01
  - LR Scheduler: Linear
  - Precision: FP16, or BF16 where the GPU supports it
  - Logging: Every 10 steps, integrated with Weights & Biases (wandb)
  - Save Strategy: Every 100 steps, limit of 2 checkpoints
  - Evaluation: Every 50 steps
- **Dataset**:
  - Source: JSON file containing pentest scenarios
  - Structure: Train (60%), Validation (20%), Test (20%) split
  - Preprocessing: Generates command-by-command prompts with random questions from `pre_attack_discussion`, variable detail levels (short, medium, detailed), and results from tools like nmap and Metasploit
  - Format: Prompt-response pairs with EOS token appended
- **Tools Used**:
  - **nmap**: For port scanning and service enumeration (e.g., OpenSSH, Redis)
  - **searchsploit**: To identify known vulnerabilities in services (e.g., OpenSSH 7.2p2 username enumeration)
  - **msfconsole (Metasploit)**: For exploit searches and automated attack attempts (e.g., SSH login brute-forcing)

## GPU Support

- **GPU Detection**: Checks CUDA availability and GPU name (target: NVIDIA GTX 1060 3GB)
- **CUDA Version**: Compatible with CUDA 12.1
- **Optimization**: 4-bit quantization and gradient checkpointing to fit within 3GB VRAM constraints

## Dataset Example

- **Goal**: Exploit vulnerabilities in OpenSSH 7.2p2 on port 22 to gain initial access, then assess Redis on port 6383 for unauthenticated data exposure and potential lateral movement.

## Dataset

Trained on [Nielzac/cybersecurity-sft-dataset](https://huggingface.co/datasets/Nielzac/cybersecurity-sft-dataset), containing prompts and responses for cybersecurity scenarios, filtered to exclude failed commands.

# Uploaded model

- **Developed by:** Nielzac
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen3-1.7B-unsloth-bnb-4bit

This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
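
The 60/20/20 split and the EOS-terminated prompt-response format listed under Model Configuration can be illustrated with a minimal sketch. This is not the project's actual preprocessing code: the function names `split_dataset` and `format_example`, the record fields `prompt` and `response`, the seed, and the `<eos>` placeholder are all assumptions made for illustration. In a real run the EOS string would come from the tokenizer (for example `tokenizer.eos_token` in Transformers).

```python
import random

# Assumption: placeholder EOS string; a real pipeline would read it from the tokenizer.
EOS_TOKEN = "<eos>"


def split_dataset(records, seed=42):
    """Shuffle records and split them 60/20/20 into train/validation/test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the split boundaries.
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


def format_example(prompt, response, eos_token=EOS_TOKEN):
    """Join a prompt and its response into one training string ending in EOS."""
    return f"{prompt}\n{response}{eos_token}"


# Hypothetical records standing in for the pentest scenario JSON file.
records = [
    {"prompt": f"Step {i}: which command comes next?",
     "response": f"nmap -sV -p {i} <target>"}
    for i in range(10)
]

splits = split_dataset(records)
train_texts = [format_example(r["prompt"], r["response"]) for r in splits["train"]]
```

Appending the EOS token to every example is what lets the fine-tuned model learn where a response ends; without it, generation tends to run on until the token limit is reached.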