learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF
This LoRA adapter was converted to GGUF format from learn-abc/html-model-tinyllama-chat-bnb-4bit via ggml.ai's GGUF-my-lora space.
Refer to the original adapter repository for more details.
Fine-tuned TinyLlama for JSON Extraction (GGUF)
This repository contains a fine-tuned version of the unsloth/tinyllama-chat-bnb-4bit model, trained specifically to extract product information from HTML snippets and output it as JSON. This is the GGUF quantized version, intended for use with llama.cpp and other compatible inference engines.
Model Details
- Base Model: learn-abc/html-model-tinyllama-chat-bnb-4bit
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Quantization: q4_k_m GGUF
- Trained on: A custom dataset of HTML product snippets and their corresponding JSON representations.
Usage
This model can be used for tasks involving structured data extraction from HTML content using GGUF compatible software.
Downloading and using the GGUF file
You can download the GGUF file directly from the "Files and versions" tab on this repository page.
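Alternatively, you can fetch the file programmatically. The snippet below is a minimal sketch that assumes the huggingface_hub library is installed; the filename shown matches the adapter GGUF used in the llama.cpp commands further down, but verify it against the "Files and versions" tab before running.

```python
# Minimal sketch: download the adapter GGUF with huggingface_hub.
# Assumes `pip install huggingface_hub`; verify the filename against the
# "Files and versions" tab of this repository.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF",
    filename="html-model-tinyllama-chat-bnb-4bit-f32.gguf",
)
print(gguf_path)  # local path to the downloaded file
```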
To use this file with llama.cpp, you generally follow these steps:
- Download llama.cpp: Clone the llama.cpp repository and build it. Follow the instructions in the llama.cpp README for building on your specific platform.
Use with llama.cpp
# with cli
llama-cli -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)
# with server
llama-server -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)
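Once llama-server is running with the base model and this adapter, you can send requests over HTTP. The sketch below assumes the default address (127.0.0.1:8080) and the server's /completion endpoint; adjust the URL and parameters to your setup.

```python
# Minimal sketch: query a running llama-server over HTTP using only the
# standard library. Assumes the server was started as shown above and is
# listening on the default 127.0.0.1:8080.
import json
import urllib.request

payload = {
    "prompt": (
        "Extract the product information:\n"
        "<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span>"
        "<span class='category'>audio</span><span class='brand'>Dell</span></div>"
    ),
    "n_predict": 256,
    "temperature": 0.7,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
print(result["content"])  # generated completion text
```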
Use a Python script
Install the llama-cpp-python bindings
pip install llama-cpp-python
Python script to run the model
from llama_cpp import Llama
# Replace with the actual path to your downloaded GGUF file
model_path = "/path/to/your/downloaded/html-model-tinyllama-chat-bnb-4bit-F32-GGUF.gguf"
llm = Llama(model_path=model_path)
prompt = "Extract the product information:\n<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span><span class='category'>audio</span><span class='brand'>Dell</span></div>"
output = llm(prompt, max_tokens=256, temperature=0.7)
print(output["choices"][0]["text"])
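Note that the GGUF file in this repository is a LoRA adapter rather than a standalone model. If you are loading the adapter on its own, llama-cpp-python also needs a base-model GGUF; recent versions accept a lora_path argument, roughly as sketched below (base_model.gguf is a placeholder, and parameter availability depends on your llama-cpp-python version).

```python
# Sketch: apply the LoRA adapter on top of a base-model GGUF with
# llama-cpp-python. Assumes a recent version that supports lora_path;
# base_model.gguf is a placeholder for a GGUF export of the base TinyLlama.
from llama_cpp import Llama

llm = Llama(
    model_path="/path/to/base_model.gguf",
    lora_path="/path/to/html-model-tinyllama-chat-bnb-4bit-f32.gguf",
)
```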
To learn more about using LoRA adapters with the llama.cpp server, refer to the llama.cpp server documentation.
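For reference, recent llama.cpp server builds also expose a /lora-adapters endpoint for inspecting the adapters loaded with --lora; availability depends on your build, so treat the sketch below as illustrative and check the server documentation for your version.

```python
# Illustrative sketch: list the LoRA adapters a running llama-server has loaded.
# Assumes a recent build that provides the /lora-adapters route and the default
# 127.0.0.1:8080 address; older builds may not have it.
import json
import urllib.request

with urllib.request.urlopen("http://127.0.0.1:8080/lora-adapters") as resp:
    adapters = json.loads(resp.read())
print(adapters)  # loaded adapters with their paths and scales
```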