Regarding Fine-Tuning Before Quantization of GGUF Models

#1
by Baharababah - opened

Hello, I’ve been using some of your quantized GGUF models and greatly appreciate your work in making them available to the community. I have a quick question about your quantization process:

Do you fine-tune the base model in any way before performing the quantization (e.g., for Q4_K, Q2_K, etc.), or are the models quantized directly from the original pretrained weights?

Knowing this would help me better evaluate the performance trade-offs and potential use cases for these models.

Thank you in advance for your time, and again, I appreciate the valuable contributions you're making!
