🌾 LLaMA Late Blight Classifier (Huancavelica, Peru)

This model is a fine-tuned classifier based on openlm-research/open_llama_3b, trained to predict potato late blight risk levels (Bajo, Moderado, Alto) in the highlands of Huancavelica, Peru. It uses environmental inputs (temperature, humidity, precipitation) and crop variety metadata to output discrete classifications.


🤝 Use Case

Direct Use: Agronomic advisory systems or research tools predicting potato late blight risk from structured prompts or API queries.

Not for: Open-ended generation, conversational use, or regions with different pathogen pressures without retraining.


🌐 Model Details

  • Base model: openlm-research/open_llama_3b
  • Architecture: LLaMA-3B with classification head (AutoModelForSequenceClassification)
  • Fine-tuning method: Full fine-tuning on a balanced, curated dataset (not LoRA)
  • Tokenizer: Compatible LLaMA tokenizer (tokenizer.model included)
  • Language: Spanish (structured prompts)
  • Task: Hard classification (3-class)

🎓 Training

  • Dataset: 156 training + 24 validation examples (balanced across 3 classes)
  • Labels: Bajo, Moderado, Alto
  • Format (JSONL):
    {
      "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
      "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
      "output": "Moderado"
    }
    
  • Epochs: 10
  • Optimizer: AdamW (mixed precision)
  • Hardware: 1x A100 40GB (Colab Pro, single GPU)
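
A record in the JSONL format above could be turned into a classifier input roughly as follows. This is a minimal sketch, not the author's preprocessing: the helper name `record_to_example`, the newline used to join `instruction` and `input`, and the integer ids in `LABEL2ID` are all assumptions, since the card does not state them.

```python
import json

# Assumed label order; the card lists the classes but not their integer ids.
LABEL2ID = {"Bajo": 0, "Moderado": 1, "Alto": 2}


def record_to_example(line: str) -> dict:
    """Parse one JSONL line into a text/label pair for sequence classification."""
    record = json.loads(line)
    # Assumption: instruction and input are joined with a newline.
    text = f"{record['instruction']}\n{record['input']}"
    return {"text": text, "label": LABEL2ID[record["output"]]}


# One line of the training file, built from the example record in the card.
sample = json.dumps(
    {
        "instruction": "Evalúa el riesgo de tizón tardío basado en los datos climáticos y la variedad.",
        "input": "Escenario 1: Temperatura promedio 17.2 °C, Humedad 83%, Precipitación 3.4 mm, Variedad Yungay",
        "output": "Moderado",
    },
    ensure_ascii=False,
)

example = record_to_example(sample)
print(example["label"])
```

If the published model config defines its own `label2id` mapping, that mapping should take precedence over the assumed one here.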

🌿 Evaluation (Balanced Test Set, n = 90)

Class      Precision  Recall  F1    Support
Bajo       1.00       0.90    0.95  30
Moderado   0.91       1.00    0.95  30
Alto       1.00       1.00    1.00  30
Accuracy                      0.97  90
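
The reported figures can be sanity-checked with a short calculation. The confusion matrix below is inferred from the table (27 of 30 Bajo cases correct, with the 3 errors predicted as Moderado); it is not taken from the author's evaluation logs, and the names `CONFUSION` and `per_class_metrics` are illustrative.

```python
LABELS = ["Bajo", "Moderado", "Alto"]

# Rows are true classes, columns are predicted classes.
# Inferred from the reported metrics, not from the author's logs.
CONFUSION = [
    [27, 3, 0],   # true Bajo
    [0, 30, 0],   # true Moderado
    [0, 0, 30],   # true Alto
]


def per_class_metrics(cm):
    """Return {label: (precision, recall, f1)} from a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(LABELS):
        tp = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum
        actual = sum(cm[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


metrics = per_class_metrics(CONFUSION)
accuracy = sum(CONFUSION[i][i] for i in range(len(LABELS))) / sum(map(sum, CONFUSION))
macro_f1 = sum(m[2] for m in metrics.values()) / len(metrics)

for label, (p, r, f1) in metrics.items():
    print(f"{label:<9} {p:.2f} {r:.2f} {f1:.2f}")
print(f"accuracy {accuracy:.2f}, macro F1 {macro_f1:.2f}")
```

Under this reconstruction, accuracy is 87/90 and macro F1 is about 0.967, both of which round to the reported 0.97.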

📈 Intended Use and Limitations

  • Designed for: Highland regions of Peru (especially Huancavelica), where the expert-labeled ground truth reflects local pathogen behavior.
  • Limitations:
    • May generalize poorly to lowland areas or different varieties.
    • Not a substitute for in-field disease monitoring.

📑 Citation

If you use this model, please cite:

Jorge Luis Alonso, Predicting Potato Late Blight in Huancavelica Using LLaMA Models, 2025


🌍 License

MIT License (model + training data)


⚡ Quick Inference Example

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hub.
repo_id = "jalonso24/llama-lateblight-classifier"
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# top_k=1 keeps only the highest-scoring class.
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=1)

prompt = "Escenario: Temperatura 18.1 °C, Humedad 85%, Variedad Amarilis"
clf(prompt)
# ➞ [{'label': 'Alto', 'score': 0.95}]
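
The prompt above omits precipitation, which the training examples include. For callers that start from numeric readings, a small formatter along these lines may help keep inference prompts close to the training `input` format. The function `build_prompt` and its parameter names are hypothetical, and the card does not say whether the `instruction` text should also be prepended at inference time.

```python
def build_prompt(temp_c: float, humidity_pct: float, precip_mm: float, variety: str) -> str:
    """Format readings like the `input` field of the training examples.

    Assumption: one decimal for temperature and precipitation and an integer
    humidity percentage, mirroring the single example shown in the card.
    """
    return (
        f"Escenario: Temperatura promedio {temp_c:.1f} °C, "
        f"Humedad {humidity_pct:.0f}%, "
        f"Precipitación {precip_mm:.1f} mm, "
        f"Variedad {variety}"
    )


prompt = build_prompt(17.2, 83, 3.4, "Yungay")
print(prompt)
```

The resulting string can be passed to the classifier in place of a hand-written prompt.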

Model size: 3.32B params (Safetensors, tensor type F32)

Evaluation results (self-reported, Huancavelica Late Blight Benchmark, Balanced)

  • Accuracy: 0.970
  • F1 (macro): 0.970
  • Precision: 0.970
  • Recall: 0.970