# Gemma 3 (1B) JSON E-commerce Intent Recognizer
This repository contains a fine-tuned version of `google/gemma-3-1b-it`, specialized for recognizing user intent in e-commerce queries and outputting a structured JSON object.

The model is designed to analyze a user's request against a provided product catalog and determine whether the user wants to `add` or `remove` an item, identifying the specific product and the desired quantity.
This model was fine-tuned using the QLoRA method for high efficiency.
## Model Description

- **Base Model:** `google/gemma-3-1b-it`
- **Fine-tuning Method:** QLoRA (4-bit Quantized Low-Rank Adaptation)
- **Task:** Text-to-JSON for E-commerce Intent Recognition
- **Output Format:** A clean JSON object with `action`, `product`, and `quantity` keys (see the example below)
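For example, an "add" request produces output of the following form (the values here are illustrative):

```json
{"action": "add", "product": "Shampoo (400ml bottle)", "quantity": 2}
```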
## How to Use

This model expects input in a specific chat format. You must provide a `Catalog` of available items followed by a `User` query, as shown below.
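For illustration, the content of a single user turn might look like this (the catalog items are placeholders):

```
Catalog:
Shampoo (400ml bottle)
Green Tea (25 tea bags)

User:
Add 2 bottles of shampoo
```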
### Installation

First, make sure you have the necessary libraries installed (`accelerate` is required for `device_map="auto"`):

```bash
pip install optimum auto-gptq transformers accelerate
```
### Inference Code

Here is a sample Python snippet to run inference with this model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "jtlicardo/gemma-3-1b-ecommerce-intent-gptq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # or torch.float16
)

# --- Create your prompt ---
catalog = """Catalog:
Shampoo (400ml bottle)
Hand Soap (250ml dispenser)
Peanut Butter (340g jar)
Headphones
Green Tea (25 tea bags)"""

user_query = "Could you please take off 4 pairs of headphones from my cart?"

# --- Format the prompt using Gemma's chat template ---
# The tokenizer handles this automatically
chat = [
    {"role": "user", "content": f"{catalog}\n\nUser:\n{user_query}"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# --- Run inference ---
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=False,  # use do_sample=False for deterministic output
)

# Decode only the newly generated tokens and print the result
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print("--- Model Output ---")
print(result)

# --- Expected Output ---
# {"action": "remove", "product": "Headphones", "quantity": 4}
```
## Fine-tuning Details
The model was fine-tuned for 3 epochs on a custom dataset of 100 examples. A 90/10 train/validation split was used to monitor for overfitting and select the best-performing checkpoint.
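The exact training script is not included here. For reference, the sketch below shows one typical way to set up QLoRA fine-tuning with `peft` and a recent version of `trl`; the quantization settings, LoRA rank, target modules, and the tiny dataset are illustrative assumptions, not the actual configuration used for this checkpoint.

```python
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_id = "google/gemma-3-1b-it"

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Tiny illustrative dataset in chat format; the real run used 100 custom
# examples with a 90/10 train/validation split
train_ds = Dataset.from_list([
    {"messages": [
        {"role": "user", "content": "Catalog:\nShampoo (400ml bottle)\n\nUser:\nAdd 2 bottles of shampoo"},
        {"role": "assistant", "content": '{"action": "add", "product": "Shampoo (400ml bottle)", "quantity": 2}'},
    ]},
])

# Low-rank adapters trained on top of the quantized model (the "LoRA" part)
peft_config = LoraConfig(
    r=16,  # illustrative rank, not the value used for this checkpoint
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=train_ds,
    peft_config=peft_config,
    args=SFTConfig(output_dir="gemma-ecom-intent", num_train_epochs=3),
)
trainer.train()
```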