Model Card for LogLaye-LLaMA3.2B-QLoRA-finetuned-HDFS-logs
Model Details
Model Description
This model is a version of LLaMA 3.2B fine-tuned with QLoRA for log anomaly detection on HDFS (Hadoop Distributed File System) logs.
- Developed by: Abdoulaye MBAYE (FatLab, Fathala IT, ZigZeug)
- Finetuned from model: meta-llama/LLaMA-3.2B
- Language(s): English (Log data text)
- License: original Llama 3 license (non-commercial research use only)
- Model type: Causal Language Model (Instruction-tuned for classification)
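The exact training setup is not documented in this card. As a rough sketch, a QLoRA fine-tune of this kind is typically built with 4-bit NF4 quantization (bitsandbytes) plus low-rank adapters (peft); every hyperparameter below is an illustrative assumption, not the configuration actually used for this model.

```python
# Illustrative QLoRA setup; all hyperparameters are assumptions,
# not the actual training configuration of this model.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/LLaMA-3.2B"  # base model as named in this card

# QLoRA loads the frozen base model in 4-bit NF4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Trainable low-rank adapters on the attention projections
# (rank, alpha, and target modules are typical defaults, not confirmed values).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```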
Model Sources
- Repository: https://huggingface.co/ZigZeug/LogLaye-LLaMA3.2B-QLoRA-finetuned-HDFS-logs
- Dataset: https://huggingface.co/datasets/ZigZeug/HDFS-logs-cleaned-chatml
Uses
Direct Use
This model classifies HDFS log lines into two labels:
- "normal" → expected system behavior.
- "anomalous" → suspicious or error-prone system behavior.
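Because the model answers in free text, the generation has to be reduced to one of these two labels downstream. A minimal post-processing helper might look like the following; the `extract_label` name and the substring heuristic are illustrative assumptions, not part of the released model.

```python
# Hypothetical post-processing helper (not shipped with the model):
# collapse a free-form completion into one of the two supported labels.
def extract_label(generated_text: str) -> str:
    text = generated_text.lower()
    # Check anomaly wording first so "abnormal" is not misread as "normal".
    if "anomal" in text or "abnormal" in text:
        return "anomalous"
    if "normal" in text:
        return "normal"
    # Fail closed: treat unparseable answers as anomalous for human review.
    return "anomalous"

assert extract_label("This log is normal.") == "normal"
assert extract_label("Anomalous: block replication error") == "anomalous"
```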
Downstream Use
- Infrastructure log monitoring (a minimal sketch follows this list)
- Automated ML-based observability
- Large-scale system supervision
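As an illustration of the log-monitoring use case, here is a hypothetical `scan_log_file` helper that flags anomalous lines in a plain-text log file. It assumes `model` and `tokenizer` are loaded as shown under "How to Get Started" below and reuses the `extract_label` sketch above; the function itself is an assumption, not an API provided with the model.

```python
# Hypothetical monitoring loop (illustrative only). Assumes `model` and
# `tokenizer` are loaded as in "How to Get Started", and reuses the
# extract_label helper sketched above.
SYSTEM_PROMPT = ("You are an expert in HDFS log analysis. "
                 "Classify if the following log is normal or anomalous.")

def scan_log_file(path, model, tokenizer, max_lines=1000):
    flagged = []
    with open(path) as f:
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            messages = [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Log: {line.strip()}"},
            ]
            chat = tokenizer.apply_chat_template(
                messages, tokenize=False, add_generation_prompt=True
            )
            inputs = tokenizer(chat, return_tensors="pt").to(model.device)
            outputs = model.generate(**inputs, max_new_tokens=20)
            # Decode only the newly generated tokens, not the echoed prompt.
            answer = tokenizer.decode(
                outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
            )
            if extract_label(answer) == "anomalous":
                flagged.append((i + 1, line.strip()))  # 1-indexed line numbers
    return flagged
```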
Out-of-Scope Use
- Not designed for non-HDFS log formats.
- Not suitable for general-purpose natural language tasks.
Bias, Risks, and Limitations
- The model was trained on pre-processed HDFS logs; its behavior on logs from other systems is untested and may be unreliable.
- The model does not explain why a log is anomalous; it only predicts a label.
Recommendations
Always keep a human in the loop when deploying anomaly detection models in production.
How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "ZigZeug/LogLaye-LLaMA3.2B-QLoRA-finetuned-HDFS-logs"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are an expert in HDFS log analysis. Classify if the following log is normal or anomalous."},
    {"role": "user", "content": "Log: PacketResponder 2 for block blk_-3552845605773916309 terminating"}
]

# Render the chat template and move inputs to wherever device_map placed the model.
chat = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=20)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
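Because the prompt tokens are sliced off before decoding, the printed completion is just the model's answer, which should contain one of the two labels described under Direct Use.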