---
license: apache-2.0
language:
- es
tags:
- falcon-fine-tune
- gguf
- llama.cpp
- lince-zero-quantized
model_name: LINCE-ZERO
base_model: clibrain/lince-zero
inference: false
model_creator: Clibrain
model_type: falcon
pipeline_tag: text-generation
prompt_template: >
  A continuación hay una instrucción que describe una tarea, junto con una entrada que proporciona más contexto. Escriba una respuesta que complete adecuadamente la solicitud.\n\n### Instrucción: {prompt}\n\n### Respuesta:
quantized_by: alvarobartt
---

# Model Card for LINCE-ZERO-7B-GGUF

[LINCE-ZERO](https://huggingface.co/clibrain/lince-zero) is an instruction-following LLM fine-tuned from [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b). The fine-tune was led by [Clibrain](https://huggingface.co/clibrain), using the [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) and [Dolly](https://huggingface.co/datasets/databricks/databricks-dolly-15k) datasets, both translated into Spanish and augmented to 80k examples (as Clibrain states in its [model card](https://huggingface.co/clibrain/lince-zero#model-card-for-lince-zero)).

This repository contains the quantized variants of LINCE-ZERO in the GGUF format, introduced by the [llama.cpp](https://github.com/ggerganov/llama.cpp) team.

Some may wonder: why not just use [TheBloke/lince-zero-GGUF](https://huggingface.co/TheBloke/lince-zero-GGUF)? You can indeed use those files via `llama.cpp` to run inference over LINCE-ZERO on limited resources, but if you want to use the model via [LM Studio](https://lmstudio.ai/) on macOS you may run into issues, as it may only work with the `q4_k_s`, `q4_k_m`, `q5_k_s`, and `q5_k_m` quantization formats, and those are not included in TheBloke's repository.

## Model Details

### Model Description

- **Model type:** Falcon
- **Fine-tuned from model:** [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b)
- **Created by:** [TIIUAE](https://huggingface.co/tiiuae)
- **Fine-tuned by:** [Clibrain](https://huggingface.co/clibrain)
- **Quantized by:** [alvarobartt](https://huggingface.co/alvarobartt)
- **Language(s) (NLP):** Spanish
- **License:** Apache 2.0 (disclaimer: there may be a licensing mismatch, see https://huggingface.co/clibrain/lince-zero/discussions/5)

### Model Sources

- **Repository:** [LINCE-ZERO](https://huggingface.co/clibrain/lince-zero)

### Model Files

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ----- |
| [lince-zero-7b-q4_k_s.gguf](https://huggingface.co/alvarobartt/lince-zero-7b-GGUF/blob/main/lince-zero-7b-q4_k_s.gguf) | Q4_K_S | 4 | 7.41 GB | 9.91 GB | small, greater quality loss |
| [lince-zero-7b-q4_k_m.gguf](https://huggingface.co/alvarobartt/lince-zero-7b-GGUF/blob/main/lince-zero-7b-q4_k_m.gguf) | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended |
| [lince-zero-7b-q5_k_s.gguf](https://huggingface.co/alvarobartt/lince-zero-7b-GGUF/blob/main/lince-zero-7b-q5_k_s.gguf) | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
| [lince-zero-7b-q5_k_m.gguf](https://huggingface.co/alvarobartt/lince-zero-7b-GGUF/blob/main/lince-zero-7b-q5_k_m.gguf) | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |

**Note**: the above RAM figures assume no GPU offloading. Offloading layers to the GPU reduces RAM usage and uses VRAM instead.
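
As a reference, the snippet below is a minimal sketch of how one of the quantized files could be downloaded programmatically with `huggingface_hub`; the repository and file names are taken from the table above, and you can swap the `filename` for any other quantization listed there.

```python
from huggingface_hub import hf_hub_download

# Download the recommended Q4_K_M file from this repository
# (swap `filename` for any other entry in the table above)
model_path = hf_hub_download(
    repo_id="alvarobartt/lince-zero-7b-GGUF",
    filename="lince-zero-7b-q4_k_m.gguf",
)
print(model_path)  # local path to the downloaded GGUF file
```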
## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed] (a tentative inference sketch is provided at the end of this card)

## Training Details

All the training details can be found at [Falcon 7B - Training Details](https://huggingface.co/tiiuae/falcon-7b#training-details), and the fine-tuning details at [LINCE-ZERO - Training Details](https://huggingface.co/clibrain/lince-zero#%F0%9F%93%9A-training-details).
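
As noted in the "How to Get Started with the Model" section above, no official snippet is provided yet; the following is a minimal, untested sketch of how inference could be run over one of the GGUF files via [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python), using the prompt template from this card's metadata. The file name, example instruction, and generation parameters are illustrative assumptions.

```python
from llama_cpp import Llama

# Load a local GGUF file (e.g. the Q4_K_M variant from the table above);
# increase `n_gpu_layers` to offload layers to the GPU and reduce RAM usage
llm = Llama(model_path="lince-zero-7b-q4_k_m.gguf", n_ctx=2048, n_gpu_layers=0)

# Prompt template used by LINCE-ZERO (see `prompt_template` in the metadata above);
# the instruction here is only an illustrative example
prompt = (
    "A continuación hay una instrucción que describe una tarea, junto con una entrada "
    "que proporciona más contexto. Escriba una respuesta que complete adecuadamente la "
    "solicitud.\n\n"
    "### Instrucción: Dame una lista de lugares a visitar en España.\n\n"
    "### Respuesta:"
)

output = llm(prompt, max_tokens=256, temperature=0.7, stop=["### Instrucción:"])
print(output["choices"][0]["text"])
```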