ayan4m1 committed · Commit ac9071c · verified · 1 Parent(s): 8824ef3

Update README.md

Files changed (1): README.md +57 -3

README.md CHANGED
@@ -1,3 +1,57 @@
- ---
- license: llama3.1
- ---
+ ---
+ base_model: meta-llama/Llama-3.1-8B
+ datasets:
+ - mlfoundations-dev/oh-dcft-v3.1-claude-3-5-sonnet-20241022
+ - mpasila/Sonnet3.5-SlimOrcaDedupCleaned-4k-context
+ language:
+ - en
+ license: mit
+ pipeline_tag: text-generation
+ tags:
+ - llama3.1
+ - sonnet
+ - claude
+ quantized_by: ayan4m1
+ inference: false
+ fine-tuning: true
+ ---
+
+ ## GGUF Quantizations of Llama-3.1-8B Sonnet fine-tuning
+
+ Using unsloth for fine-tuning and quantization:
+
+ ```
+ ==((====))== Unsloth 2025.2.4: Fast Llama patching. Transformers: 4.48.2.
+ \\ /| GPU: NVIDIA A100-SXM4-40GB. Max memory: 39.557 GB. Platform: Linux.
+ O^O/ \_/ \ Torch: 2.5.1+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.1.0
+ \ / Bfloat16 = TRUE. FA [Xformers = 0.0.29. FA2 = False]
+ "-____-" Free Apache license: http://github.com/unslothai/unsloth
+ ```
+
+ Original model: https://huggingface.co/meta-llama/Llama-3.1-8B
+
+ Fine-tuned on the open Sonnet datasets listed above, containing roughly 1.2M question/answer pairs.
+
+ ## Prompt format
+
+ ```
+ <|begin_of_text|>{prompt}
+ ```
+
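Since this is a base-model fine-tune rather than an instruct model, the template above is just the BOS token followed by raw text that the model will continue. A minimal sketch of assembling it (the `build_prompt` helper name is hypothetical, not part of any library):

```python
# Hypothetical helper: prepend the BOS token per the prompt format above.
# Note: most llama.cpp frontends add <|begin_of_text|> automatically, so
# only do this manually when feeding raw token strings yourself.
def build_prompt(prompt: str) -> str:
    return f"<|begin_of_text|>{prompt}"

text = build_prompt("The capital of France is")
print(text)  # <|begin_of_text|>The capital of France is
```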
+ ## Download a file (not the whole branch) from below:
+
+ | Filename | Quant type | File Size | Split | Description |
+ | ------------------------------------------------------------------------------------------------------------------------------------ | ---------- | --------- | ----- | ------------------------------------------------------------------- |
+ | [Llama-3.1-8B-Sonnet-Q8_0.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q8_0.gguf) | Q8_0 | 74.98GB | true | Extremely high quality, generally unneeded but max available quant. |
+ | [Llama-3.1-8B-Sonnet-Q6_K.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q6_K.gguf) | Q6_K | 57.89GB | true | Very high quality, near perfect, _recommended_. |
+ | [Llama-3.1-8B-Sonnet-Q5_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/tree/main/Llama-3.1-8B-Sonnet-Q5_K_M.gguf) | Q5_K_M | 49.95GB | true | High quality, _recommended_. |
+ | [Llama-3.1-8B-Sonnet-Q4_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q4_K_M.gguf) | Q4_K_M | 42.52GB | false | Good quality, default size for most use cases, _recommended_. |
+ | [Llama-3.1-8B-Sonnet-Q3_K_L.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q3_K_L.gguf) | Q3_K_L | 37.14GB | false | Lower quality but usable, good for low RAM availability. |
+ | [Llama-3.1-8B-Sonnet-Q3_K_M.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q3_K_M.gguf) | Q3_K_M | 30.91GB | false | Lower quality, not recommended. |
+ | [Llama-3.1-8B-Sonnet-Q2_K.gguf](https://huggingface.co/ayan4m1/Llama-3.1-8B-Sonnet-GGUF/blob/main/Llama-3.1-8B-Sonnet-Q2_K.gguf) | Q2_K | 26.38GB | false | Very low quality but surprisingly usable. |
+
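As a rough illustration of the table's size guidance, here is a hypothetical helper (not part of any library) that picks the largest quant fitting a given memory budget; the filenames and sizes are taken directly from the table above:

```python
# Quants from the table above, ordered largest-first (sizes in GB as listed).
QUANTS = [
    ("Llama-3.1-8B-Sonnet-Q8_0.gguf", 74.98),
    ("Llama-3.1-8B-Sonnet-Q6_K.gguf", 57.89),
    ("Llama-3.1-8B-Sonnet-Q5_K_M.gguf", 49.95),
    ("Llama-3.1-8B-Sonnet-Q4_K_M.gguf", 42.52),
    ("Llama-3.1-8B-Sonnet-Q3_K_L.gguf", 37.14),
    ("Llama-3.1-8B-Sonnet-Q3_K_M.gguf", 30.91),
    ("Llama-3.1-8B-Sonnet-Q2_K.gguf", 26.38),
]

def pick_quant(budget_gb: float):
    """Return the filename of the largest quant that fits the budget, or None."""
    for name, size in QUANTS:
        if size <= budget_gb:
            return name
    return None

print(pick_quant(48.0))  # Llama-3.1-8B-Sonnet-Q4_K_M.gguf
```

To fetch a single file rather than cloning the whole branch, `huggingface-cli download ayan4m1/Llama-3.1-8B-Sonnet-GGUF <filename>` (from the `huggingface_hub` package) will download just that one file.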
+ ## Credits
+
+ Thanks to Meta, [mlfoundations-dev](https://huggingface.co/mlfoundations-dev), and [mpasila](https://huggingface.co/mpasila) for providing the data used to create this fine-tuning.
+
+ Thanks to [bartowski](https://huggingface.co/bartowski) for this README template.