# German Phishing Detection
axolotl version: `0.8.0.dev0`

```yaml
base_model: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
load_in_8bit: true
load_in_4bit: false
bf16: auto
gradient_checkpointing: true
sequence_len: 4096
max_prompt_len: 512
tokenizer_use_fast: true

adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj
  - gate_proj
  - down_proj
  - up_proj

datasets:
  - path: AndyAT/Phishing_indicators
    type: alpaca
    train_file: train_alpaca.jsonl
    validation_file: val_alpaca.jsonl
    test_file: test_alpaca.jsonl
    trust_remote_code: true
dataset_processes: 32
val_set_size: 0.001
shuffle_merged_datasets: true

num_epochs: 1.0
micro_batch_size: 2
gradient_accumulation_steps: 32
optimizer: adamw_bnb_8bit
learning_rate: 0.0002
lr_scheduler: cosine
weight_decay: 0.0

output_dir: ./outputs/mistral7b_phishing
save_strategy: steps
save_steps: 100
save_total_limit: 3
save_safetensors: true
evaluation_strategy: steps
eval_steps: 100
load_best_model_at_end: true
logging_steps: 10

trl:
  use_vllm: false

train_on_inputs: false
group_by_length: true
seed: 42
```
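Training with a file like this is typically launched through axolotl's CLI (e.g. `accelerate launch -m axolotl.cli.train config.yml`). For readers reproducing the adapter outside axolotl, the LoRA settings above correspond to the following PEFT `LoraConfig`; this is a sketch restating the same values, not code from the original card:

```python
from peft import LoraConfig

# Mirrors the adapter block of the axolotl config above
# (axolotl builds an equivalent config internally).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```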
This model is a fine-tuned version of [NousResearch/Nous-Hermes-2-Mistral-7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO) on the AndyAT/Phishing_indicators dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2807

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
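A minimal inference sketch follows. It assumes the trained adapter was saved to the `output_dir` from the config above (adjust the path to wherever the adapter actually lives), loads the base model in 8-bit as during training, and uses the standard alpaca prompt template matching `type: alpaca`; the instruction wording and example email are hypothetical placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"
adapter_path = "./outputs/mistral7b_phishing"  # output_dir from the config; placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # requires a GPU + bitsandbytes
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Alpaca-style prompt; instruction and German email text are made-up examples.
prompt = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\nClassify the following email as phishing or legitimate "
    "and list the indicators.\n\n"
    "### Input:\nSehr geehrter Kunde, Ihr Konto wurde gesperrt. "
    "Bitte klicken Sie hier, um es zu entsperren.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```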
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (taken from the config above):
- learning_rate: 0.0002
- micro_batch_size: 2
- gradient_accumulation_steps: 32 (effective batch size 64 per device)
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- weight_decay: 0.0
- num_epochs: 1.0
- seed: 42
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0033 | 1    | 1.2508          |
| 0.347         | 0.3333 | 100  | 0.4167          |
| 0.3203        | 0.6665 | 200  | 0.3208          |
| 0.3171        | 0.9998 | 300  | 0.2807          |
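For standalone deployment or later quantization (e.g. to GGUF), the LoRA adapter is usually merged into the base weights first. The sketch below uses the same placeholder paths as above and is not part of the original card; note the model is loaded unquantized here, since merging into 8-bit weights is not supported:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"
adapter_path = "./outputs/mistral7b_phishing"          # placeholder
merged_path = "./outputs/mistral7b_phishing-merged"    # placeholder

model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
model = PeftModel.from_pretrained(model, adapter_path)
merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights

# Save merged weights as safetensors, plus the tokenizer for a self-contained checkpoint.
merged.save_pretrained(merged_path, safe_serialization=True)
AutoTokenizer.from_pretrained(base_id).save_pretrained(merged_path)
```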