---
base_model: boun-tabi-LMG/TURNA
language:
- tr
license: other
model_creator: boun-tabi-LMG
model_name: TURNA
model_type: t5
prompt_template: '[S2S]prompt<EOS>'
quantized_by: Furkan Erdi
tags:
- GGUF
- Transformers
- TURNA
- t5
library_name: transformers
architecture: t5
inference: false
---
# TURNA - GGUF
- Model creator: boun-tabi-LMG
- Original model: TURNA
## Description
This repo contains GGUF format model files for boun-tabi-LMG's TURNA.
### About GGUF
GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
Here is an incomplete list of clients and libraries that are known to support GGUF:
- llama.cpp. The source project for GGUF. Offers a CLI and a server option.
- text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
- KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling.
- GPT4All, a free and open-source locally running GUI, supporting Windows, Linux and macOS with full GPU accel.
- LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
- LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
- Faraday.dev, an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
- llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
- candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
- ctransformers, a Python library with GPU accel, LangChain support, and an OpenAI-compatible API server. Note that, as of the time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.
## Prompt template

```
[S2S]prompt<EOS>
```
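For example, a tiny Python helper (a sketch, not part of the original card) that wraps raw text in this template:

```python
def format_turna_prompt(prompt: str) -> str:
    # Wrap a raw prompt in the [S2S]...<EOS> template expected by TURNA.
    return f"[S2S]{prompt}<EOS>"

print(format_turna_prompt("Bir varmış bir yokmuş"))  # [S2S]Bir varmış bir yokmuş<EOS>
```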
## Compatibility

These quantised GGUFv2 files are compatible with Hugging Face's candle framework. The models were quantised with candle using its Rust tooling (via cargo), driven from Python; see the example Colab section below for a full invocation.
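As a quick sanity check after downloading, you can confirm a file really is GGUF by inspecting its 4-byte magic header, which consists of the ASCII bytes `GGUF` (a minimal sketch; the file name is just an example):

```python
# Check the GGUF magic header of a downloaded weight file.
# Assumes TURNA_Q8_1.gguf is present in the current directory.
with open("TURNA_Q8_1.gguf", "rb") as f:
    magic = f.read(4)
print("valid GGUF" if magic == b"GGUF" else f"unexpected magic: {magic!r}")
```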
## Provided files
| Name | Bits | Quant method | Size | Use case |
| --- | --- | --- | --- | --- |
| TURNA_Q2K.gguf | 2 | Q2K | 0.36 GB | smallest size, lowest precision |
| TURNA_Q3K.gguf | 3 | Q3K | 0.48 GB | very low precision |
| TURNA_Q4_0.gguf | 4 | Q4_0 | 0.63 GB | low precision, level 0 |
| TURNA_Q4_1.gguf | 4 | Q4_1 | 0.70 GB | slightly better than Q4_0 |
| TURNA_Q4K.gguf | 4 | Q4K | 0.63 GB | k-quant, low precision |
| TURNA_Q5_0.gguf | 5 | Q5_0 | 0.77 GB | moderate precision, level 0 |
| TURNA_Q5_1.gguf | 5 | Q5_1 | 0.84 GB | better than Q5_0 |
| TURNA_Q5K.gguf | 5 | Q5K | 0.77 GB | k-quant, moderate precision |
| TURNA_Q6K.gguf | 6 | Q6K | 0.91 GB | higher precision than Q5K |
| TURNA_Q8_0.gguf | 8 | Q8_0 | 1.21 GB | high precision, level 0 |
| TURNA_Q8_1.gguf | 8 | Q8_1 | 1.29 GB | better than Q8_0 |
| TURNA_Q8K.gguf | 8 | Q8K | 1.30 GB | k-quant, highest precision among the quantised files |
| TURNA_F16.gguf | 16 | F16 | 2.28 GB | full half-precision weights; half the size of F32 |
| TURNA_F32.gguf | 32 | F32 | 4.57 GB | highest precision, largest size |
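To enumerate the available quantisations programmatically, the `huggingface_hub` library can list the repository's files (a short sketch, assuming `huggingface_hub` is installed):

```python
from huggingface_hub import list_repo_files

# List every GGUF weight file published in this repository.
gguf_files = [f for f in list_repo_files("helizac/TURNA_GGUF") if f.endswith(".gguf")]
for name in sorted(gguf_files):
    print(name)
```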
## License
The model is shared with the public to be used solely for non-commercial academic research purposes.
## How to download GGUF files
Note for manual downloaders: You almost never want to clone the entire repo! Multiple different quantisation formats are provided, and most users only want to pick and download a single file.
Several of the clients/libraries listed above (for example LM Studio, LoLLMS Web UI and Faraday.dev) will automatically download models for you, presenting a list of available models to choose from.
### On the command line, including multiple files at once
I recommend using the `huggingface-hub` Python library:

```shell
pip3 install huggingface-hub
```
Then you can download any individual model file to the current directory, at high speed, with a command like this:
```shell
huggingface-cli download helizac/TURNA_GGUF TURNA_Q4K.gguf --local-dir . --local-dir-use-symlinks False
```
### More advanced huggingface-cli download usage
You can also download multiple files at once with a pattern:

```shell
huggingface-cli download helizac/TURNA_GGUF --local-dir . --local-dir-use-symlinks False --include='*Q4K*gguf'
```
For more documentation on downloading with `huggingface-cli`, please see: HF -> Hub Python Library -> Download files -> Download from the CLI.
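The same download can also be scripted from Python with `hf_hub_download` (a short sketch; the file name is just an example from the table above):

```python
from huggingface_hub import hf_hub_download

# Download a single quantisation to the current directory.
path = hf_hub_download(
    repo_id="helizac/TURNA_GGUF",
    filename="TURNA_Q4K.gguf",
    local_dir=".",
)
print(path)
```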
## Example Colab usage
```shell
%%shell
# Update and install dependencies
apt update && apt install -y curl build-essential
pip install huggingface_hub

# Install Rust using rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y

# Add Rust to the PATH
source $HOME/.cargo/env

# Clone candle from GitHub
git clone https://github.com/huggingface/candle.git

# Log in with a Hugging Face token that has read permission
huggingface-cli login
```

Then, in a new Python cell, change into the cloned repository and define a helper that runs candle's quantized-t5 example:
```python
%cd candle

import os

def run_turna_gguf(prompt="Bir varmış bir yokmuş, ", temperature=1, quantization_method="Q8_1", config_file="config.json", model_id="helizac/TURNA_GGUF"):
    # Invoke candle's quantized-t5 example with the chosen GGUF weight file.
    os.system(f'cargo run --example quantized-t5 --release -- --model-id "{model_id}" --prompt "[S2S]{prompt}<EOS>" --temperature {temperature} --weight-file "TURNA_{quantization_method}.gguf" --config-file "{config_file}"')

run_turna_gguf("Bir varmış bir yokmuş")  # test
```
### Function explanation: run_turna_gguf
Parameters:
- `prompt` (`str`, default: `"Bir varmış bir yokmuş, "`): the initial text provided as input to the model.
- `temperature` (`float`, default: `1`): controls the randomness of the output. Higher values make the output more random, while lower values make it more deterministic.
- `quantization_method` (`str`, default: `"Q8_1"`): specifies the quantization method to use. This selects the corresponding `.gguf` weight file.
- `config_file` (`str`, default: `"config.json"`): the path to the configuration file containing model-specific settings.
- `model_id` (`str`, default: `"helizac/TURNA_GGUF"`): the identifier for the model in the Hugging Face repository.
## Thanks, and how to contribute
Thanks to the boun-tabi-LMG team!
GGUF model card: Furkan Erdi
# Original model card: BOUN TABI Language Modeling Group's TURNA
## TURNA 🦩
```bibtex
@misc{uludoğan2024turna,
      title={TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation},
      author={Gökçe Uludoğan and Zeynep Yirmibeşoğlu Balal and Furkan Akkurt and Melikşah Türker and Onur Güngör and Susan Üsküdarlı},
      year={2024},
      eprint={2401.14373},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```