	---
	base_model: boun-tabi-LMG/TURNA
	language:
	- tr
	license: other
	model_creator: boun-tabi-LMG
	model_name: TURNA
	model_type: t5
	prompt_template: '[S2S]prompt<EOS>'
	quantized_by: Furkan Erdi
	tags:
	- GGUF
	- Transformers
	- TURNA
	- t5
	library_name: transformers
	architecture: t5
	inference: false
	---

	# TURNA - GGUF
	- Model creator: [boun-tabi-LMG](https://huggingface.co/boun-tabi-LMG)
	- Original model: [TURNA](https://huggingface.co/boun-tabi-LMG/TURNA)

	<!-- description start -->
	## Description

	This repo contains GGUF format model files for [boun-tabi-LMG's TURNA](https://huggingface.co/boun-tabi-LMG/TURNA).

	<!-- description end -->
	<!-- README_GGUF.md-about-gguf start -->
	### About GGUF

	GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

	Here is an incomplete list of clients and libraries that are known to support GGUF:

	* [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
	* [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
	* [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling.
	* [GPT4All](https://gpt4all.io/index.html), a free and open source local running GUI, supporting Windows, Linux and macOS with full GPU accel.
	* [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
	* [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
	* [Faraday.dev](https://faraday.dev/), an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
	* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
	* [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
	* [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server. Note, as of time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.

	<!-- README_GGUF.md-about-gguf end -->

	<!-- prompt-template start -->
	## Prompt template

	```
	[S2S]prompt<EOS>
	```
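
As a small illustration, the template can be applied with a helper like the one below. The function name `format_prompt` is an assumption made for this sketch; only the `[S2S]` prefix and the `<EOS>` suffix come from the template itself.

```python
def format_prompt(prompt: str) -> str:
    """Wrap raw text in the TURNA template: [S2S] prefix, <EOS> suffix."""
    return f"[S2S]{prompt}<EOS>"


# Prints: [S2S]Bir varmış bir yokmuş, <EOS>
print(format_prompt("Bir varmış bir yokmuş, "))
```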

	<!-- prompt-template end -->


	<!-- compatibility_gguf start -->
	## Compatibility

	These quantised GGUFv2 files are compatible with [candle](https://github.com/huggingface/candle) from Hugging Face.

	The models were quantized with candle, using Rust (via cargo) and Python.

	<!-- compatibility_gguf end -->

	<!-- README_GGUF.md-provided-files start -->
	## Provided files

	| Name | Bits | Quant method | Size | Use case |
	| ---- | ---- | ---- | ---- | ---- |
	| [TURNA_Q2K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q2K.gguf) | 2 | Q2K | 0.36 GB | Smallest size, lowest precision |
	| [TURNA_Q3K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q3K.gguf) | 3 | Q3K | 0.48 GB | Very low precision |
	| [TURNA_Q4_0.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q4_0.gguf) | 4 | Q4_0 | 0.63 GB | Low precision, level 0 |
	| [TURNA_Q4_1.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q4_1.gguf) | 4 | Q4_1 | 0.70 GB | Slightly better than Q4_0 |
	| [TURNA_Q4K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q4K.gguf) | 4 | Q4K | 0.63 GB | K-quant, low precision |
	| [TURNA_Q5_0.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q5_0.gguf) | 5 | Q5_0 | 0.77 GB | Moderate precision, level 0 |
	| [TURNA_Q5_1.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q5_1.gguf) | 5 | Q5_1 | 0.84 GB | Better than Q5_0 |
	| [TURNA_Q5K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q5K.gguf) | 5 | Q5K | 0.77 GB | K-quant, moderate precision |
	| [TURNA_Q6K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q6K.gguf) | 6 | Q6K | 0.91 GB | Higher precision than Q5K |
	| [TURNA_Q8_0.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q8_0.gguf) | 8 | Q8_0 | 1.21 GB | High precision, level 0 |
	| [TURNA_Q8_1.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q8_1.gguf) | 8 | Q8_1 | 1.29 GB | Better than Q8_0 |
	| [TURNA_Q8K.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_Q8K.gguf) | 8 | Q8K | 1.30 GB | K-quant, highest precision among quantized |
	| [TURNA_F16.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_F16.gguf) | 16 | F16 | 2.28 GB | High precision, smaller size |
	| [TURNA_F32.gguf](https://huggingface.co/helizac/TURNA_GGUF/blob/main/TURNA_F32.gguf) | 32 | F32 | 4.57 GB | Highest precision, largest size |
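
One rough way to choose among these files is to take the largest one that fits a size budget. The sketch below assumes that a larger file means higher precision, which is a simplification, and the names `SIZES_GB` and `largest_fitting` are hypothetical. The sizes are copied from the table.

```python
from typing import Optional

# File sizes in GB, copied from the "Provided files" table.
SIZES_GB = {
    "Q2K": 0.36, "Q3K": 0.48, "Q4_0": 0.63, "Q4_1": 0.70, "Q4K": 0.63,
    "Q5_0": 0.77, "Q5_1": 0.84, "Q5K": 0.77, "Q6K": 0.91,
    "Q8_0": 1.21, "Q8_1": 1.29, "Q8K": 1.30, "F16": 2.28, "F32": 4.57,
}


def largest_fitting(budget_gb: float) -> Optional[str]:
    """Return the name of the largest listed file that fits in budget_gb, or None."""
    fitting = {name: size for name, size in SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    # Ties (e.g. Q5_0 and Q5K) resolve to whichever entry is listed first.
    return max(fitting, key=fitting.get)
```

For example, `largest_fitting(1.0)` selects `Q6K`, since the next larger file, `Q8_0`, is 1.21 GB.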

	<!-- README_GGUF.md-provided-files end -->

	# License

	The model is shared with the public to be used solely for non-commercial academic research purposes.

	<!-- README_GGUF.md-how-to-download start -->

	## How to download GGUF files

	Note for manual downloaders: You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file.

	### On the command line, including multiple files at once

	I recommend using the `huggingface-hub` Python library:

	```shell
	pip3 install huggingface-hub
	```

	Then you can download any individual model file to the current directory, at high speed, with a command like this:

	```shell
	huggingface-cli download helizac/TURNA_GGUF TURNA_Q4K.gguf --local-dir . --local-dir-use-symlinks False
	```

	<details>
	<summary>More advanced huggingface-cli download usage (click to read)</summary>

	You can also download multiple files at once with a pattern:

	```shell
	huggingface-cli download helizac/TURNA_GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4K*gguf'
	```

	For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli).

	</details>
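
A single file can also be fetched from Python with `hf_hub_download` from the `huggingface_hub` library. This is a minimal sketch: the helper names `gguf_filename` and `download_turna` are hypothetical, and the `TURNA_<method>.gguf` naming scheme is taken from the "Provided files" table.

```python
def gguf_filename(quantization_method: str) -> str:
    """Map a quantization name from the table (e.g. "Q4K") to its file name."""
    return f"TURNA_{quantization_method}.gguf"


def download_turna(quantization_method: str = "Q4K", local_dir: str = ".") -> str:
    """Download one GGUF file from helizac/TURNA_GGUF and return its local path."""
    # Imported lazily so gguf_filename() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="helizac/TURNA_GGUF",
        filename=gguf_filename(quantization_method),
        local_dir=local_dir,
    )
```

Calling `download_turna("Q8_1")` would place `TURNA_Q8_1.gguf` in the current directory. It needs network access and, if the repository requires it, a prior `huggingface-cli login`.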

	<!-- README_GGUF.md-how-to-download end -->

	<!-- README_GGUF.md-how-to-run start -->


	# Example `colab` usage

	```shell
	%%shell
	# Update and install dependencies
	apt update && apt install -y curl build-essential
	pip install huggingface_hub

	# Install Rust using rustup
	curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y

	# Add Rust to the PATH
	source $HOME/.cargo/env

	# Cloning Candle from Huggingface
	git clone https://github.com/huggingface/candle.git

	# Log in with a read token (any token that has read permission works)
	huggingface-cli login
	```

	```shell
	%cd candle
	```

	```python
	import os
	def run_turna_gguf(prompt="Bir varmış bir yokmuş, ", temperature=1, quantization_method="Q8_1", config_file="config.json", model_id="helizac/TURNA_GGUF"):
	    # Run candle's quantized-t5 example (from inside the candle checkout) with the chosen weight file
	    os.system(f'cargo run --example quantized-t5 --release -- --model-id "{model_id}" --prompt "[S2S]{prompt}<EOS>" --temperature {temperature} --weight-file "TURNA_{quantization_method}.gguf" --config-file "{config_file}"')
	```

	```python
	run_turna_gguf("Bir varmış bir yokmuş") # test
	```

	### Function Explanation: `run_turna_gguf`

	#### Parameters:
	- `prompt` (`str`, default: `"Bir varmış bir yokmuş, "`): The initial text provided as input to the model.
	- `temperature` (`float`, default: `1`): Controls the randomness of the output. Higher values make the output more random, while lower values make it more deterministic.
	- `quantization_method` (`str`, default: `"Q8_1"`): Specifies the quantization method to use. This selects the corresponding `.gguf` weight file.
	- `config_file` (`str`, default: `"config.json"`): The path to the configuration file containing model-specific settings.
	- `model_id` (`str`, default: `"helizac/TURNA_GGUF"`): The identifier for the model in the Hugging Face repository.
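
If a prompt contains quotes or other shell metacharacters, interpolating it into a shell string can break the command. One possible alternative, sketched here under the assumption that the same `cargo` flags are used, is to pass the arguments as a list to `subprocess.run`. The names `build_turna_command` and `run_turna_gguf_safe` are hypothetical.

```python
import subprocess


def build_turna_command(prompt, temperature=1.0, quantization_method="Q8_1",
                        config_file="config.json", model_id="helizac/TURNA_GGUF"):
    """Build the argument list for candle's quantized-t5 example."""
    return [
        "cargo", "run", "--example", "quantized-t5", "--release", "--",
        "--model-id", model_id,
        "--prompt", f"[S2S]{prompt}<EOS>",  # apply the TURNA prompt template
        "--temperature", str(temperature),
        "--weight-file", f"TURNA_{quantization_method}.gguf",
        "--config-file", config_file,
    ]


def run_turna_gguf_safe(prompt, **kwargs):
    """Run the command without a shell; must be called from inside the candle checkout."""
    # check=True raises CalledProcessError if cargo exits with a non-zero status.
    return subprocess.run(build_turna_command(prompt, **kwargs), check=True)
```

Because no shell is involved, the prompt reaches `cargo` as a single argument regardless of the characters it contains.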

	## Thanks, and how to contribute

	Thanks to the [boun-tabi-LMG](https://github.com/boun-tabi-LMG) team!

	<!-- footer end -->

	# GGUF model card:

	```
	{Furkan Erdi}
	```

	<!-- original-model-card start -->
	# Original model card: BOUN TABI Language Modeling Group's TURNA

	TURNA 🦩

	```
	@misc{uludoğan2024turna,
	title={TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation},
	author={Gökçe Uludoğan and Zeynep Yirmibeşoğlu Balal and Furkan Akkurt and Melikşah Türker and Onur Güngör and Susan Üsküdarlı},
	year={2024},
	eprint={2401.14373},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```