---
license: cc-by-nc-4.0
---

# 42dot_LLM-SFT-1.3B_GGUF #

* Model Creator: [42dot](https://huggingface.co/42dot)
* Original Model: [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)

## Description ##

This repository contains the GGUF conversion and the most relevant quantizations
of 42dot's [42dot_LLM-SFT-1.3B](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B)
model - ready to be used with [llama.cpp](https://github.com/ggerganov/llama.cpp)
and similar applications.

## Files ##

In order to allow for fine-tuning (the model has the required LLaMA
architecture), the original, unquantized GGUF conversion has been made
available as well:

* [42dot_LLM-SFT-1.3B.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B.gguf)

From this file, the following quantizations were derived:

* [42dot_LLM-SFT-1.3B_Q4_K_M.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q4_K_M.gguf)
* [42dot_LLM-SFT-1.3B_Q5_K_M.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q5_K_M.gguf)
* [42dot_LLM-SFT-1.3B_Q6_K.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q6_K.gguf)
* [42dot_LLM-SFT-1.3B_Q8_0.gguf](https://huggingface.co/rozek/42dot_LLM-SFT-1.3B_GGUF/blob/main/42dot_LLM-SFT-1.3B_Q8_0.gguf)

(please let me know if you need other quantizations)

## Usage Details ##

Any technical details can be found on the
[original model card](https://huggingface.co/42dot/42dot_LLM-SFT-1.3B).
The most important ones for using this model are:

* the context length is 4096 tokens
* there does not seem to be a specific prompt structure - just provide the
  text you want to be completed

### Text Completion with LLaMA.cpp ###

For simple inferencing, use a command similar to

```
./main -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --temp 0 --top-k 4 --prompt "who was Joseph Weizenbaum?"
```
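
The two sampling flags shown above effectively request near-greedy decoding: `--temp 0` makes generation deterministic, and `--top-k 4` limits the candidate pool to the four most probable tokens. The following plain-Python sketch (made-up logit values, independent of llama.cpp) illustrates what that selection step does:

```python
import math

def top_k_sample(logits, k, temperature):
    """Pick a token index from `logits` the way `--top-k K --temp T` would.

    `logits` is a plain list of floats, one per vocabulary entry.
    With temperature 0 this degenerates to argmax (greedy decoding).
    """
    if temperature == 0:  # greedy: the highest-scoring token always wins
        return max(range(len(logits)), key=lambda i: logits[i])

    # keep only the k highest-scoring candidates
    candidates = sorted(range(len(logits)),
                        key=lambda i: logits[i], reverse=True)[:k]
    # softmax over the temperature-scaled survivors
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # deterministic demo: return the most probable survivor;
    # a real sampler would draw randomly according to `probs`
    return candidates[probs.index(max(probs))]

toy_logits = [0.1, 2.5, 0.3, 1.9, -1.0]
print(top_k_sample(toy_logits, k=4, temperature=0))  # greedy -> index 1
```

A higher `--temp` flattens the distribution over those four candidates and makes the output more varied; `--temp 0` is a good default for factual question answering.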

### Text Tokenization with LLaMA.cpp ###

To get a list of tokens, use a command similar to

```
./tokenize 42dot_LLM-SFT-1.3B_Q8_0.gguf "who was Joseph Weizenbaum?"
```

(the llama.cpp example binary is called `tokenize` and takes the model path
and the prompt as positional arguments)
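
If you want to post-process that list, e.g. to count the tokens in a prompt, the output can be parsed with a few lines of Python. This is only a sketch: it assumes output lines of the form `<id> -> '<piece>'`, which may differ between llama.cpp versions, so check your build's actual output first:

```python
import re

# assumed line format of the tokenize example: "  3639 -> ' who'"
LINE = re.compile(r"^\s*(\d+)\s*->\s*'(.*)'$")

def parse_tokens(output):
    """Extract (token_id, piece) pairs from tokenizer output lines."""
    pairs = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match:  # silently skip banner/log lines that don't match
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs

sample = "  3639 -> ' who'\n  1058 -> ' was'"
print(parse_tokens(sample))  # -> [(3639, ' who'), (1058, ' was')]
```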

### Embeddings Calculation with LLaMA.cpp ###

Text embeddings are calculated with a command similar to

```
./embedding -m 42dot_LLM-SFT-1.3B_Q8_0.gguf --prompt "who was Joseph Weizenbaum?"
```
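
The command above prints one embedding vector for the whole prompt. A common way to compare such vectors (e.g. for semantic search) is cosine similarity; here is a stdlib-only sketch using made-up three-dimensional vectors - real embeddings from this model have far more dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy vectors standing in for real embedding output
v1 = [0.2, 0.1, 0.9]
v2 = [0.2, 0.1, 0.9]
v3 = [-0.9, 0.3, 0.0]
print(cosine_similarity(v1, v2))  # identical vectors -> ~1.0
print(cosine_similarity(v1, v3))  # dissimilar vectors -> much lower
```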

## License ##

The original model "_is licensed under the Creative Commons
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)_" - for that reason, the same
license was also chosen for the conversions found in this repository.

So, in order to be fair and give credit where credit is due:

* the original model was created and published by [42dot](https://huggingface.co/42dot)
* besides quantization, no changes were applied to the model itself
|