---
base_model: learn-abc/html-model-tinyllama-chat-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- llama-cpp
- gguf-my-lora
license: apache-2.0
language:
- en
---

# learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF
This LoRA adapter was converted to GGUF format from [`learn-abc/html-model-tinyllama-chat-bnb-4bit`](https://huggingface.co/learn-abc/html-model-tinyllama-chat-bnb-4bit) via ggml.ai's [GGUF-my-lora](https://huggingface.co/spaces/ggml-org/gguf-my-lora) space.
Refer to the [original adapter repository](https://huggingface.co/learn-abc/html-model-tinyllama-chat-bnb-4bit) for more details.

# Fine-tuned TinyLlama for JSON Extraction (GGUF)

This repository contains a LoRA adapter fine-tuned from the `unsloth/tinyllama-chat-bnb-4bit` model, trained to extract product information from HTML snippets and output it as JSON. The adapter is provided in GGUF format for use with `llama.cpp` and other compatible inference engines.

## Model Details

- **Base Model:** `learn-abc/html-model-tinyllama-chat-bnb-4bit`
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Format:** F32 GGUF (LoRA adapter)
- **Trained on:** A custom dataset of HTML product snippets and their corresponding JSON representations.

## Usage

Use this adapter for structured data extraction from HTML content with GGUF-compatible software.

### Downloading and using the GGUF file

You can download the GGUF file directly from the "Files and versions" tab on this repository page.

To use this file with `llama.cpp`, first clone the `llama.cpp` repository and build it, following the build instructions in its README for your platform. Then load the adapter on top of a GGUF base model as described in the next section.

## Use with llama.cpp

```bash
# with cli
llama-cli -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)

# with server
llama-server -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)
```
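
Once `llama-server` is running with the adapter loaded, it can be queried over HTTP. The following is a minimal sketch, assuming the server listens on its default address (`127.0.0.1:8080`) and exposes the `/completion` endpoint; `build_payload` and `complete` are hypothetical helper names, and the prompt prefix is the one used in the Python example in this card.

```python
import json
import urllib.request

# Assumption: llama-server was started as shown above and listens on its
# default address; change SERVER_URL if you passed --host or --port.
SERVER_URL = "http://127.0.0.1:8080/completion"


def build_payload(html_snippet, max_tokens=256):
    """Build the JSON body for the server's /completion endpoint."""
    prompt = "Extract the product information:\n" + html_snippet
    return {"prompt": prompt, "n_predict": max_tokens}


def complete(html_snippet, url=SERVER_URL):
    """POST the prompt to the server and return the generated text."""
    body = json.dumps(build_payload(html_snippet)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["content"]


# Example call (requires a running server):
# print(complete("<div class='product'><h2>iPad Air</h2></div>"))
```

Keeping the payload construction in its own function makes it possible to inspect the request without a running server.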


## Use with a Python script
### Install llama-cpp-python
```bash
pip install llama-cpp-python
```
### Python script to run the model
```python
from llama_cpp import Llama

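# Path to a GGUF build of the base model (the adapter cannot be loaded on its own)
base_model_path = "/path/to/your/base_model.gguf"
# Path to the LoRA adapter GGUF file downloaded from this repository
lora_path = "/path/to/your/downloaded/html-model-tinyllama-chat-bnb-4bit-f32.gguf"

# lora_path applies the adapter on top of the base model; this assumes a
# llama-cpp-python version with GGUF LoRA adapter support
llm = Llama(model_path=base_model_path, lora_path=lora_path)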

prompt = "Extract the product information:\n<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span><span class='category'>audio</span><span class='brand'>Dell</span></div>"

output = llm(prompt, max_tokens=256, temperature=0.7)

print(output["choices"][0]["text"])
```
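
Because the model is trained to emit JSON, the generated text usually needs to be parsed before further use, and small models may wrap the object in extra text. The sketch below shows one way to pull the first JSON object out of a completion; `extract_json` is a hypothetical helper, and the field names in the usage note are illustrative only, since this card does not document the output schema.

```python
import json


def extract_json(text):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores anything that follows it.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None
```

For example, `extract_json('Result: {"name": "iPad Air"} done')` returns a dictionary with a single `name` key, and a completion that contains no valid JSON object yields `None`, which the caller can treat as a failed extraction.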

To learn more about LoRA usage with the llama.cpp server, refer to the [llama.cpp server documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md).