---
base_model: meta-llama/Llama-3.1-8B
datasets:
  - mlfoundations-dev/oh-dcft-v3.1-claude-3-5-sonnet-20241022
  - mpasila/Sonnet3.5-SlimOrcaDedupCleaned-4k-context
language:
  - en
license: mit
pipeline_tag: text-generation
tags:
  - llama3.1
  - sonnet
  - claude
  - unsloth
quantized_by: ayan4m1
inference: false
fine-tuning: true
---

## GGUF Quantizations of Llama-3.1-8B Sonnet fine-tuning

Using unsloth for fine-tuning and quantization:

```
==((====))==  Unsloth 2025.2.4: Fast Llama patching. Transformers: 4.48.2.
   \\   /|    GPU: NVIDIA A100-SXM4-40GB. Max memory: 39.557 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
```

Original model: https://huggingface.co/meta-llama/Llama-3.1-8B

Fine-tuned on open Sonnet datasets containing ~1.2 million question/answer pairs.

## Prompt format

```
<|begin_of_text|>{prompt}
```

## Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [Llama-3.1-8B-Sonnet-Q8_0.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q8_0.gguf) | Q8_0 | 74.98GB | true | Extremely high quality, generally unneeded but max available quant. |
| [Llama-3.1-8B-Sonnet-Q6_K.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q6_K.gguf) | Q6_K | 57.89GB | true | Very high quality, near perfect, _recommended_. |
| [Llama-3.1-8B-Sonnet-Q5_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q5_K_M.gguf) | Q5_K_M | 49.95GB | true | High quality, _recommended_. |
| [Llama-3.1-8B-Sonnet-Q4_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q4_K_M.gguf) | Q4_K_M | 42.52GB | false | Good quality, default size for most use cases, _recommended_. |
| [Llama-3.1-8B-Sonnet-Q3_K_L.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q3_K_L.gguf) | Q3_K_L | 37.14GB | false | Lower quality but usable, good for low RAM availability. |
| [Llama-3.1-8B-Sonnet-Q3_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q3_K_M.gguf) | Q3_K_M | 30.91GB | false | Lower quality, not recommended. |
| [Llama-3.1-8B-Sonnet-Q2_K.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q2_K.gguf) | Q2_K | 26.38GB | false | Very low quality but surprisingly usable. |

## Credits

Thanks to Meta and [mlfoundations-dev](https://huggingface.co/mlfoundations-dev) for providing the data used to create this fine-tune.

Thanks to [bartowski](https://huggingface.co/bartowski) for this README template.
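
## Example: downloading a single quant

The download table lists one file per quant type. A minimal sketch of fetching just one of them from Python follows, assuming the `huggingface_hub` package is installed. The helper names `gguf_filename` and `download_quant` and the `models` target directory are illustrative and not part of this repository. The sketch only covers the quants the table marks as not split; the split quants are linked as directories and would need a different approach.

```python
# Repository and file naming pattern taken from the download table.
REPO_ID = "ayan4m1/Llama-3.1-8B-Sonnet-GGUF"

# Quant types the table lists with Split = false (single-file downloads).
SINGLE_FILE_QUANTS = {"Q4_K_M", "Q3_K_L", "Q3_K_M", "Q2_K"}


def gguf_filename(quant: str) -> str:
    """Build the GGUF filename for a non-split quant type (hypothetical helper)."""
    if quant not in SINGLE_FILE_QUANTS:
        raise ValueError(f"{quant!r} is not listed as a single-file quant")
    return f"Llama-3.1-8B-Sonnet-{quant}.gguf"


def download_quant(quant: str, local_dir: str = "models") -> str:
    """Download one quant file and return its local path (hypothetical helper)."""
    # Imported here so the filename helper works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=gguf_filename(quant),
        local_dir=local_dir,
    )


# Usage (performs a large network download, so it is left commented out):
# path = download_quant("Q4_K_M")
# print(path)
```

Restricting the helper to the non-split quants keeps it from requesting a filename that may not exist as a single file in the repository.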