learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF

This LoRA adapter was converted to GGUF format from learn-abc/html-model-tinyllama-chat-bnb-4bit via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details.

Fine-tuned TinyLlama for JSON Extraction (GGUF)

This repository contains the GGUF conversion of a LoRA adapter for the unsloth/tinyllama-chat-bnb-4bit model, fine-tuned specifically to extract product information from HTML snippets and output it as JSON. The adapter is provided as an F32 GGUF file and is applied on top of a base TinyLlama GGUF with tools like llama.cpp or other compatible inference engines.

Model Details

  • Original adapter: learn-abc/html-model-tinyllama-chat-bnb-4bit
  • Base model: unsloth/tinyllama-chat-bnb-4bit
  • Fine-tuning method: LoRA (Low-Rank Adaptation)
  • Format: F32 GGUF LoRA adapter (~50.5M parameters, llama architecture)
  • Trained on: a custom dataset of HTML product snippets and their corresponding JSON representations (an illustrative example is sketched below).
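
As a purely illustrative sketch of the task (the exact JSON schema used during training is not documented here, so the keys below are hypothetical), a training pair could map an HTML product snippet to a flat JSON object:

# Hypothetical input/output pair; the real dataset's JSON schema may differ.
example_input = (
    "<div class='product'><h2>iPad Air</h2>"
    "<span class='price'>$1344</span>"
    "<span class='category'>audio</span>"
    "<span class='brand'>Dell</span></div>"
)
example_output = {
    "name": "iPad Air",
    "price": "$1344",
    "category": "audio",
    "brand": "Dell",
}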

Usage

This adapter can be used for structured data extraction from HTML content with any GGUF-compatible software.

Downloading and using the GGUF file

You can download the GGUF file directly from the "Files and versions" tab on this repository page.
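
Alternatively, you can fetch the file programmatically with the huggingface_hub library. A minimal sketch, assuming the adapter file in this repository is named html-model-tinyllama-chat-bnb-4bit-f32.gguf (check the "Files and versions" tab for the exact filename):

from huggingface_hub import hf_hub_download

# Download the LoRA adapter GGUF into the local Hugging Face cache and return its path.
adapter_path = hf_hub_download(
    repo_id="learn-abc/html-model-tinyllama-chat-bnb-4bit-F32-GGUF",
    filename="html-model-tinyllama-chat-bnb-4bit-f32.gguf",  # assumed filename; verify in the repo
)
print(adapter_path)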

To use this adapter with llama.cpp, you generally follow these steps:

  1. Build llama.cpp: Clone the llama.cpp repository and build it, following the instructions in the llama.cpp README for your specific platform.

  2. Download the files: Get a GGUF of the base TinyLlama model and the LoRA adapter GGUF from this repository.

  3. Run inference: Start llama-cli or llama-server with the base model and pass this adapter via the --lora flag, as shown below.

Use with llama.cpp

# with llama-cli (base_model.gguf is a GGUF of the base TinyLlama model; --lora applies this adapter on top)
llama-cli -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)

# with llama-server
llama-server -m base_model.gguf --lora html-model-tinyllama-chat-bnb-4bit-f32.gguf (...other args)

Use a Python script

Install llama-cpp-python (the Python bindings for llama.cpp)

pip install llama-cpp-python

Python script to run the model

from llama_cpp import Llama

# This repository provides a LoRA adapter in GGUF format, so it is applied on top of
# a base TinyLlama GGUF rather than loaded on its own.
# Replace both paths with the actual locations of your downloaded files.
base_model_path = "/path/to/your/downloaded/tinyllama-base.gguf"
adapter_path = "/path/to/your/downloaded/html-model-tinyllama-chat-bnb-4bit-f32.gguf"

llm = Llama(model_path=base_model_path, lora_path=adapter_path)

prompt = "Extract the product information:\n<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span><span class='category'>audio</span><span class='brand'>Dell</span></div>"

output = llm(prompt, max_tokens=256, temperature=0.7)

print(output["choices"][0]["text"])
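
If the model emits well-formed JSON, the generated text can be parsed directly with Python's standard library. A minimal sketch, continuing from the script above and assuming the completion contains a single valid JSON object:

import json

# Parse the generated completion text into a Python dict.
# This assumes the model produced one valid JSON object and nothing else.
product = json.loads(output["choices"][0]["text"].strip())
print(product)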

To learn more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation.
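
For a quick check against a running server, you can also query the llama.cpp completion endpoint over HTTP. A minimal sketch, assuming llama-server was started with the base model and this adapter (as in the server command above), is listening on its default address http://localhost:8080, and that the requests package is installed:

import requests

# Send a completion request to a locally running llama-server instance.
response = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "Extract the product information:\n<div class='product'><h2>iPad Air</h2><span class='price'>$1344</span></div>",
        "n_predict": 256,
    },
)
print(response.json()["content"])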
