---
license: cc-by-nc-4.0
---

# 42dot_LLM-SFT-1.3B_GGUF #

* Model Creator: [42dot](https://huggingface.co/42dot)
* Original Model: [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)

## Description ## 

This repository contains the GGUF conversion and the most relevant quantizations 
of 42dot's
[42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B) model - ready 
to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar 
applications.

## Files ##

To allow for fine-tuning (the model uses the required LLaMA architecture), the
original unquantized GGUF conversion has also been made available:
* [42dot_LLM-SFT-1.3B.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B.gguf)

From this file, the following quantizations were derived:

* [42dot_LLM-SFT-1.3B-Q4_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q4_K_M.gguf)
* [42dot_LLM-SFT-1.3B-Q5_K_M](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q5_K_M.gguf)
* [42dot_LLM-SFT-1.3B-Q6_K](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q6_K.gguf)
* [42dot_LLM-SFT-1.3B-Q8_0](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q8_0.gguf)

(tell me if you need more)
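When choosing among the quantizations above, on-disk size is roughly proportional to bits per weight. The sketch below estimates file sizes from the parameter count; the bits-per-weight figures are rough assumptions for llama.cpp quantization types, not measured values.

```python
# Rough on-disk size estimate for a quantized GGUF model.
# The bits-per-weight values are rough assumptions for llama.cpp
# quantization types (real files also contain metadata and some
# mixed-precision tensors, so actual sizes differ somewhat).
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def estimated_size_gb(n_params: float, quant: str) -> float:
    """Approximate file size in gigabytes for `n_params` weights."""
    return BITS_PER_WEIGHT[quant] * n_params / 8 / 1e9

if __name__ == "__main__":
    # 42dot_LLM-SFT-1.3B has about 1.3 billion parameters
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimated_size_gb(1.3e9, quant):.2f} GB")
```

As a rule of thumb, Q4_K_M is the usual size/quality compromise, while Q8_0 stays closest to the unquantized model.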

## Usage Details ##

All technical details can be found on the
[original model card](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B).
The most important ones for using this model are:

* the context length is 4096 tokens
* there does not seem to be a specific prompt structure - just provide the text
you want to be completed

### Text Completion with LLaMA.cpp ###

For simple inference, use a command similar to

```
./main -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
```

### Text Tokenization with LLaMA.cpp ###

To get a list of tokens, use a command similar to

```
./tokenize 42dot_LLM-SFT-1.3B_Q8_0.gguf "who was Joseph Weizenbaum?"
```

### Embeddings Calculation with LLaMA.cpp ###

Text embeddings are calculated with a command similar to

```
./embedding -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --prompt "who was Joseph Weizenbaum?"
```

## License ##

The original model "_is licensed under the Creative Commons 
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)_" - for that reason, the same 
license was also chosen for the conversions found in this repository.

So, to be fair and give credit where it is due:

* the original model was created and published by [42dot](https://huggingface.co/42dot)
* besides quantization, no changes were applied to the model itself