---
license: cc-by-nc-4.0
---

# 42dot_LLM-SFT-1.3B_GGUF #

* Model Creator: [42dot](https://huggingface.co/42dot)
* Original Model: [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)

## Description ##

This repository contains the GGUF conversion and the most relevant quantizations
of 42dot's [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)
model - ready to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp)
and similar applications.

## Files ##

In order to allow for fine-tuning (the model has the required LLaMA
architecture), the original, unquantized GGUF conversion has been made
available as well:

* [42dot_LLM-SFT-1.3B.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B.gguf)

From this file, the following quantizations were derived:

* [42dot_LLM-SFT-1.3B_Q4_K_M.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q4_K_M.gguf)
* [42dot_LLM-SFT-1.3B_Q5_K_M.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q5_K_M.gguf)
* [42dot_LLM-SFT-1.3B_Q6_K.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q6_K.gguf)
* [42dot_LLM-SFT-1.3B_Q8_0.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q8_0.gguf)

(please let me know if you need other quantizations)

## Usage Details ##

Any technical details can be found on the
[original model card](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B).
The most important ones for using this model are:

* the context length is 4096 tokens
* there does not seem to be a specific prompt structure - just provide the
  text you want to be completed

### Text Completion with LLaMA.cpp ###

For simple inferencing, use a command similar to

```
./main -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
```
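
The two sampling flags shown above effectively request near-greedy decoding: `--temp 0` makes generation deterministic, and `--top-k 4` limits the candidate pool to the four most probable tokens. The following plain-Python sketch (made-up logit values, independent of llama.cpp) illustrates what that selection step does:

```python
import math

def top_k_sample(logits, k, temperature):
    """Pick a token index from `logits` the way `--top-k K --temp T` would.

    `logits` is a plain list of floats, one per vocabulary entry.
    With temperature 0 this degenerates to argmax (greedy decoding).
    """
    if temperature == 0:  # greedy: the highest-scoring token always wins
        return max(range(len(logits)), key=lambda i: logits[i])

    # keep only the k highest-scoring candidates
    candidates = sorted(range(len(logits)),
                        key=lambda i: logits[i], reverse=True)[:k]
    # softmax over the temperature-scaled survivors
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # deterministic demo: return the most probable survivor;
    # a real sampler would draw randomly according to `probs`
    return candidates[probs.index(max(probs))]

toy_logits = [0.1, 2.5, 0.3, 1.9, -1.0]
print(top_k_sample(toy_logits, k=4, temperature=0))  # greedy -> index 1
```

A higher `--temp` flattens the distribution over those four candidates and makes the output more varied; `--temp 0` is a good default for factual question answering.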

### Text Tokenization with LLaMA.cpp ###

To get a list of tokens, use a command similar to

```
./tokenize 42dot_LLM-SFT-1.3B_Q8_0.gguf "who was Joseph Weizenbaum?"
```

(the llama.cpp example binary is called `tokenize` and takes the model path
and the prompt as positional arguments)
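
If you want to post-process that list, e.g. to count the tokens in a prompt, the output can be parsed with a few lines of Python. This is only a sketch: it assumes output lines of the form `<id> -> '<piece>'`, which may differ between llama.cpp versions, so check your build's actual output first:

```python
import re

# assumed line format of the tokenize example: "  3639 -> ' who'"
LINE = re.compile(r"^\s*(\d+)\s*->\s*'(.*)'$")

def parse_tokens(output):
    """Extract (token_id, piece) pairs from tokenizer output lines."""
    pairs = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match:  # silently skip banner/log lines that don't match
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs

sample = "  3639 -> ' who'\n  1058 -> ' was'"
print(parse_tokens(sample))  # -> [(3639, ' who'), (1058, ' was')]
```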

### Embeddings Calculation with LLaMA.cpp ###

Text embeddings are calculated with a command similar to

```
./embedding -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --prompt "who was Joseph Weizenbaum?"
```
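
The command above prints one embedding vector for the whole prompt. A common way to compare such vectors (e.g. for semantic search) is cosine similarity; here is a stdlib-only sketch using made-up three-dimensional vectors - real embeddings from this model have far more dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy vectors standing in for real embedding output
v1 = [0.2, 0.1, 0.9]
v2 = [0.2, 0.1, 0.9]
v3 = [-0.9, 0.3, 0.0]
print(cosine_similarity(v1, v2))  # identical vectors -> ~1.0
print(cosine_similarity(v1, v3))  # dissimilar vectors -> much lower
```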

## License ##

The original model "_is licensed under the Creative Commons
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)_" - for that reason, the same
license was also chosen for the conversions found in this repository.

So, in order to be fair and give credit where credit is due:

* the original model was created and published by [42dot](https://huggingface.co/42dot)
* besides quantization, no changes were applied to the model itself
|