Phi-4 Cybersecurity Chatbot - Q4_K_M GGUF

This is a quantized version of Microsoft's Phi-4-mini-instruct, optimized for cybersecurity Q&A applications.

Model Details

  • Base Model: microsoft/phi-4-mini-instruct
  • Parameters: 3.84B (phi3 architecture)
  • Quantization: Q4_K_M (4-bit k-quant)
  • Format: GGUF
  • Size: ~2.5GB (reduced from ~7.7GB in FP16)
  • License: MIT
  • Use Case: Cybersecurity training and best practices chatbot

Intended Use

This model is intended and optimized for:

  • Answering cybersecurity questions
  • Providing security best practices
  • Explaining phishing, malware, and other threats
  • Advising on password security and data protection
  • Guiding incident response

Performance

  • RAM Required: 4-6GB
  • CPU Compatible: Yes
  • Inference Speed: roughly 15-20 tokens/second on a modern desktop CPU (hardware-dependent; see the benchmark sketch below)
  • Context Length: 4096 tokens
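
To measure throughput on your own hardware, llama.cpp ships a benchmark tool. A minimal sketch, assuming the model file is in the current directory:

# Benchmark prompt processing and generation speed for this model
./llama-bench -m phi4-mini-instruct-Q4_K_M.gguf -t 8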

Usage

With llama.cpp

# Download the model
wget https://huggingface.co/YOUR_USERNAME/phi4-cybersec-Q4_K_M/resolve/main/phi4-mini-instruct-Q4_K_M.gguf

# Run with llama.cpp (recent builds name this binary llama-cli rather than main)
./main -m phi4-mini-instruct-Q4_K_M.gguf -p "What is phishing?" -n 256
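
You can also serve the model over an OpenAI-compatible HTTP API with the server bundled with llama.cpp. A minimal sketch (port and context size are illustrative):

# Serve the model over an OpenAI-compatible HTTP API on port 8080
./llama-server -m phi4-mini-instruct-Q4_K_M.gguf -c 4096 --port 8080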

With Python (llama-cpp-python)

from llama_cpp import Llama

# Load model
llm = Llama(
    model_path="phi4-mini-instruct-Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=8,
    n_gpu_layers=0  # CPU only
)

# Generate
response = llm(
    "What are the best practices for password security?",
    max_tokens=256,
    temperature=0.7,
    stop=["<|end|>", "<|user|>"]
)

print(response['choices'][0]['text'])
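
llama-cpp-python can also apply the model's chat template for you through its chat-completion API. A minimal sketch reusing the llm instance from above (the example question is illustrative):

# Let llama-cpp-python apply the chat template automatically
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity expert assistant."},
        {"role": "user", "content": "What is a brute-force attack?"},
    ],
    max_tokens=256,
    temperature=0.7,
)

print(response['choices'][0]['message']['content'])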

With LangChain

from langchain_community.llms import LlamaCpp  # langchain.llms on older LangChain versions

llm = LlamaCpp(
    model_path="phi4-mini-instruct-Q4_K_M.gguf",
    temperature=0.7,
    max_tokens=256,
    n_ctx=4096
)

response = llm("How do I identify suspicious emails?")
print(response)
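
Because LlamaCpp is a LangChain Runnable, it composes with a prompt template. A minimal sketch (the template text and question are illustrative):

from langchain_core.prompts import PromptTemplate

# Build a reusable question-answering chain with the pipe (Runnable) syntax
prompt = PromptTemplate.from_template(
    "You are a cybersecurity expert assistant. Answer concisely.\n\nQuestion: {question}"
)
chain = prompt | llm

print(chain.invoke({"question": "What is two-factor authentication?"}))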

Prompt Format

The model uses the Phi-style chat template:

<|system|>
You are a cybersecurity expert assistant.
<|end|>
<|user|>
What is malware?
<|end|>
<|assistant|>
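
When calling the raw completion API, you apply this template yourself. A minimal sketch using the llm instance from the llama-cpp-python example (exact newline placement is an assumption; check the model's chat template if output looks off):

# Build the Phi-style prompt manually for the raw completion API
system = "You are a cybersecurity expert assistant."
question = "What is malware?"
prompt = f"<|system|>\n{system}<|end|>\n<|user|>\n{question}<|end|>\n<|assistant|>\n"

response = llm(prompt, max_tokens=256, stop=["<|end|>"])
print(response['choices'][0]['text'])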

Quantization Details

This model was quantized using llama.cpp with the following process:

  1. Original model: microsoft/phi-4-mini-instruct
  2. Conversion: HF → GGUF format (FP16)
  3. Quantization: GGUF FP16 → Q4_K_M
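
A sketch of the typical llama.cpp commands for steps 2 and 3 (paths are illustrative; older releases name the script convert-hf-to-gguf.py and the binary quantize):

# Step 2: convert the Hugging Face checkpoint to GGUF at FP16
python convert_hf_to_gguf.py ./phi-4-mini-instruct --outtype f16 --outfile phi4-mini-f16.gguf

# Step 3: quantize the FP16 GGUF down to Q4_K_M
./llama-quantize phi4-mini-f16.gguf phi4-mini-instruct-Q4_K_M.gguf Q4_K_M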

The Q4_K_M quantization method provides:

  • 4-bit k-quant quantization (super-blocks with per-block scales)
  • Mixed precision for the most important weight tensors
  • ~70% size reduction relative to FP16
  • Minimal quality loss in practice (perplexity stays close to FP16)
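
The size figures follow from a quick back-of-the-envelope calculation, assuming Q4_K_M averages roughly 4.85 bits per weight (the exact rate varies by tensor):

params = 3.84e9                     # parameter count
fp16_gb = params * 2 / 1e9          # 2 bytes/param -> ~7.7 GB
q4km_gb = params * 4.85 / 8 / 1e9   # ~4.85 bits/param -> ~2.3 GB (actual files run slightly larger)
print(fp16_gb, q4km_gb, 1 - q4km_gb / fp16_gb)  # ~7.68, ~2.33, ~0.70 (70% smaller)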

Limitations

  • Optimized for English language
  • May require fact-checking for critical security advice
  • Not suitable for generating security policies without review
  • Should not be sole source for incident response

Ethical Considerations

This model is intended to improve cybersecurity awareness and should be used responsibly:

  • Always verify critical security advice
  • Don't use for malicious purposes
  • Respect privacy and data protection laws
  • Consider cultural and organizational context

Citation

If you use this model, please cite:

@misc{phi4-cybersec-gguf,
  author = {Your Name},
  title = {Phi-4 Cybersecurity Q4_K_M GGUF},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/YOUR_USERNAME/phi4-cybersec-Q4_K_M}
}

Acknowledgments

  • Microsoft for the original Phi-4-mini-instruct model
  • llama.cpp team for quantization tools
  • The open-source community

Contact

For questions or issues: [[email protected]]
