Gemma 3 (1B) JSON E-commerce Intent Recognizer

This repository contains a fine-tuned version of google/gemma-3-1b-it, specialized for recognizing user intent in e-commerce queries and outputting a structured JSON object.

The model is designed to analyze a user's request against a provided product catalog and determine whether the user wants to add or remove an item, identifying the specific product and the desired quantity.

This model was fine-tuned using the QLoRA method for high efficiency; the weights published in this repository are additionally GPTQ-quantized (as reflected in the repository name) for lightweight inference.

Model Description

  • Base Model: google/gemma-3-1b-it
  • Fine-tuning Method: QLoRA (4-bit Quantized Low-Rank Adaptation)
  • Quantization: GPTQ (the weights hosted in this repository)
  • Task: Text-to-JSON for E-commerce Intent Recognition
  • Output Format: A clean JSON object with action, product, and quantity keys (see the example below).
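
For example, given a catalog that lists Headphones and the query "Add two pairs of headphones to my cart", the model is expected to emit:

{"action": "add", "product": "Headphones", "quantity": 2}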

How to Use

This model expects input in a specific chat format: a single user turn containing a Catalog of available items followed by the User query, as shown below.
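
Concretely, the content of the user message looks like this (items and query are placeholders):

Catalog:
<item 1>
<item 2>
...

User:
<user query>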

Installation

First, make sure you have the necessary libraries installed. The checkpoint in this repository is GPTQ-quantized, so optimum and auto-gptq are required alongside transformers; accelerate is needed for device_map="auto":

pip install optimum auto-gptq transformers accelerate

Inference Code

Here is a sample Python snippet to run inference with this model.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "jtlicardo/gemma-3-1b-ecommerce-intent-gptq" 

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16 # fp16 is the usual dtype for GPTQ checkpoints
)

# --- Create your prompt ---
catalog = """Catalog:
Shampoo (400ml bottle)
Hand Soap (250ml dispenser)
Peanut Butter (340g jar)
Headphones
Green Tea (25 tea bags)"""

user_query = "Could you please take off 4 pairs of headphones from my cart?"

# --- Format the prompt using Gemma's chat template ---
# The tokenizer handles this automatically
chat = [
    { "role": "user", "content": f"{catalog}\n\nUser:\n{user_query}" },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# --- Run inference ---
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=False # greedy decoding for deterministic JSON output
)

# Decode and print the result
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("--- Model Output ---")
print(result)

# --- Expected Output ---
# {"action": "remove", "product": "Headphones", "quantity": 4}

Fine-tuning Details

The model was fine-tuned for 3 epochs on a custom dataset of 100 examples. A 90/10 train/validation split was used to monitor for overfitting and select the best-performing checkpoint.
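
For reference, a typical QLoRA setup with the peft and bitsandbytes libraries looks like the sketch below. The exact hyperparameters used for this model (LoRA rank, alpha, target modules, learning rate) are not published here, so the values shown are illustrative assumptions rather than the actual training configuration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit NF4 precision (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-1b-it",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable low-rank adapters (hyperparameters are illustrative)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()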
