monostate-model-fa6e20a4
This model is a fine-tuned version of unsloth/gemma-3-270m-it.
Model Description
This model was fine-tuned using the Monostate training platform with LoRA (Low-Rank Adaptation) for efficient training.
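The Monostate training code itself is not included in this card, but the idea behind LoRA can be illustrated in a few lines: instead of updating a full weight matrix, training learns two small matrices whose product forms a low-rank update that is added to the frozen pretrained weight. The sketch below is purely illustrative; the dimensions and initialization scale are assumptions, not the actual training code.

import torch

# Illustration only: LoRA learns a low-rank update delta_W = (alpha / r) * (B @ A)
# while the pretrained weight W stays frozen. Dimensions here are made up.
d_out, d_in, r, alpha = 640, 640, 128, 128

W = torch.randn(d_out, d_in)           # frozen pretrained weight
A = torch.randn(r, d_in) * 0.01        # trainable, small random init
B = torch.zeros(d_out, r)              # trainable, zero init so delta_W starts at 0

delta_W = (alpha / r) * (B @ A)        # low-rank update learned during fine-tuning
W_adapted = W + delta_W                # effective weight used at inference

# Fraction of parameters that are trainable vs. full fine-tuning of W
print((d_out * r + r * d_in) / (d_out * d_in))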
Training Details
Training Data
- Dataset size: 162 samples
- Training type: Supervised Fine-Tuning (SFT)
Training Procedure
Training Hyperparameters
- Training regime: Mixed precision (fp16)
- Optimizer: AdamW
- LoRA rank: 128
- LoRA alpha: 128
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
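The hyperparameters above correspond roughly to a PEFT configuration along the following lines. This is a reconstruction for reference, not the exact Monostate training script; values not listed in the card (batch size, epochs, learning rate, etc.) are assumptions.

from peft import LoraConfig
from transformers import TrainingArguments

# Reconstruction of the listed settings; unlisted values are assumptions.
lora_config = LoraConfig(
    r=128,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="monostate-model-fa6e20a4",
    fp16=True,                       # mixed precision, as listed above
    optim="adamw_torch",             # AdamW optimizer, as listed above
    per_device_train_batch_size=4,   # assumption: not reported in this card
    num_train_epochs=3,              # assumption: not reported in this card
)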
Training Results
- Final loss: 1.132578730583191
- Training time: 0.63 minutes
- Generated on: 2025-08-24T13:22:39.258180
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("andrewmonostate/monostate-model-fa6e20a4")
tokenizer = AutoTokenizer.from_pretrained("andrewmonostate/monostate-model-fa6e20a4")
# Generate text
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
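Because the base model (gemma-3-270m-it) is an instruction-tuned variant, prompts formatted with the tokenizer's chat template may give better results than raw text. A minimal sketch, continuing from the snippet above and assuming the chat template shipped with the tokenizer:

# Format the prompt as a chat turn using the tokenizer's built-in template
messages = [
    {"role": "user", "content": "Your prompt here"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))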
Framework Versions
- Transformers: 4.40+
- PyTorch: 2.0+
- Datasets: 2.0+
- Tokenizers: 0.19+
License
This model is released under the Apache License 2.0.
Citation
If you use this model, please cite:
@misc{andrewmonostate_monostate_model_fa6e20a4,
  title={monostate-model-fa6e20a4},
  author={Monostate},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/andrewmonostate/monostate-model-fa6e20a4}
}
Training Platform
This model was trained using Monostate, an AI training and deployment platform.