|
--- |
|
base_model: learn-abc/html-model-tinyllama-chat-bnb-4bit |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
- llama-cpp |
|
- gguf-my-lora |
|
license: apache-2.0 |
|
language: |
|
- en |
|
--- |
|
|
|
# learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF |
|
This LoRA adapter was converted to GGUF format from [`learn-abc/html-model-tinyllama-chat-bnb-4bit`](https://huggingface.co/learn-abc/html-model-tinyllama-chat-bnb-4bit) via ggml.ai's [GGUF-my-lora](https://huggingface.co/spaces/ggml-org/gguf-my-lora) space.
|
Refer to the [original adapter repository](https://huggingface.co/learn-abc/html-model-tinyllama-chat-bnb-4bit) for more details. |
|
|
|
# Fine-tuned TinyLlama for JSON Extraction (GGUF) |
|
|
|
This repository contains a fine-tuned version of the `unsloth/tinyllama-chat-bnb-4bit` model, trained to extract product information from HTML snippets and output it as JSON. This is the GGUF conversion of the LoRA adapter, for use with `llama.cpp` and other compatible inference engines.
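The exact output schema depends on the fine-tuning dataset and is not documented here; as an illustration only, for the example HTML snippet used later in this card, a target output might look like:

```json
{
  "name": "iPad Air",
  "price": "$1344",
  "category": "audio",
  "brand": "Dell"
}
```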
|
|
|
## Model Details |
|
|
|
- **Base Model:** `unsloth/tinyllama-chat-bnb-4bit`

- **Source Adapter:** [`learn-abc/html-model-tinyllama-chat-bnb-4bit`](https://huggingface.co/learn-abc/html-model-tinyllama-chat-bnb-4bit)

- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)

- **Format:** F32 GGUF (LoRA adapter only; it must be applied on top of a GGUF build of the base model)

- **Trained on:** A custom dataset of HTML product snippets and their corresponding JSON representations.
|
|
|
## Usage |
|
|
|
This adapter can be used for structured data extraction from HTML content with GGUF-compatible software, applied on top of a GGUF build of the base model.
|
|
|
### Downloading and using the GGUF file |
|
|
|
You can download the GGUF file directly from the "Files and versions" tab on this repository page. |
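Alternatively, you can fetch it programmatically with the `huggingface_hub` library. A minimal sketch; the repo id is taken from this card's title and the filename from the `llama.cpp` commands below:

```python
from huggingface_hub import hf_hub_download

# Download the adapter GGUF from this repo; returns the local file path.
adapter_path = hf_hub_download(
    repo_id="learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF",
    filename="html-model-tinyllama-chat-bnb-4bit-f32.gguf",
)
print(adapter_path)
```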
|
|
|
To use this file with `llama.cpp`, you generally follow these steps:

1. **Build `llama.cpp`:** Clone the `llama.cpp` repository and build it, following the instructions in its README for your platform.

2. **Get a base model GGUF:** This file is only the LoRA adapter; you also need a GGUF build of the base TinyLlama chat model to apply it to.

3. **Run with the adapter:** Pass the base model to `llama-cli` or `llama-server` and point `--lora` at this adapter file, as shown in the commands below.
|
|
|
### Use with llama.cpp
|
|
|
```bash
# With the CLI: base_model.gguf is a GGUF build of the base TinyLlama chat
# model (obtained separately); --lora applies this adapter on top of it.
llama-cli -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)

# With the server: same flags, served over an HTTP API.
llama-server -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)
```
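Once `llama-server` is running with the adapter loaded, you can query its OpenAI-compatible HTTP API. A minimal sketch, assuming the default `localhost:8080` and an abbreviated HTML snippet:

```python
import requests

# llama-server exposes an OpenAI-compatible chat endpoint by default.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {
                "role": "user",
                "content": "Extract the product information:\n"
                           "<div class='product'><h2>iPad Air</h2>"
                           "<span class='price'>$1344</span></div>",
            }
        ],
        "max_tokens": 256,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```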
|
|
|
|
|
### Use with llama-cpp-python
|
#### Install llama-cpp-python
|
```bash |
|
pip install llama-cpp-python |
|
``` |
|
#### Python script to run the model
|
```python
from llama_cpp import Llama

# This repo only contains the LoRA adapter, so load a GGUF build of the
# base TinyLlama chat model and apply the adapter on top of it.
base_model_path = "/path/to/your/base_model.gguf"
lora_path = "/path/to/your/downloaded/html-model-tinyllama-chat-bnb-4bit-f32.gguf"

llm = Llama(model_path=base_model_path, lora_path=lora_path)

prompt = "Extract the product information:\n<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span><span class='category'>audio</span><span class='brand'>Dell</span></div>"

output = llm(prompt, max_tokens=256, temperature=0.7)

print(output["choices"][0]["text"])
```
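Since the base model is chat-tuned, you may get better results from the chat API, which applies the chat template stored in the base model's GGUF metadata (if present). A sketch reusing `llm` and `prompt` from above:

```python
# Wrap the extraction prompt in a chat turn so the chat template is applied.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```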
|
|
|
To learn more about LoRA usage with the llama.cpp server, refer to the [llama.cpp server documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md).
|
|