SmolLM2-360M JSON E-commerce Intent Recognizer

This repository contains a fine-tuned version of HuggingFaceTB/SmolLM2-360M-Instruct. This lightweight model has been specialized for recognizing user intent from e-commerce related queries and outputting a structured JSON object.

The model is designed to analyze a user's request against a provided product catalog and determine whether the user wants to add or remove an item, identifying the specific product and the desired quantity.

The model was fine-tuned with QLoRA, which keeps the memory cost of training low. Combined with its small size, this makes it a practical choice for resource-constrained environments.

Model Description

  • Base Model: HuggingFaceTB/SmolLM2-360M-Instruct
  • Fine-tuning Method: QLoRA (4-bit Quantized Low-Rank Adaptation)
  • Task: Text-to-JSON for E-commerce Intent Recognition
  • Output Format: A clean JSON object with action, product, and quantity keys.

It is intended as a small, fast alternative to larger models for this specific task.

Performance and Use Case

This model was trained as part of an experiment comparing performance against model size. While it successfully performs the JSON generation task, its accuracy and robustness are significantly lower than those of its larger counterpart, gemma-3-1b-ecommerce-intent-gptq, which achieved a much lower validation loss.

This SmolLM2-360M version is best suited for:

  • Environments with extreme memory or compute constraints.
  • Educational purposes and demonstrating the fine-tuning process.
  • Applications where lower accuracy is an acceptable trade-off for speed and size.

How to Use

The model expects input in the chat format defined by the base model. You must provide a Catalog of available items followed by a User query.
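The message layout described above can be sketched as a small helper. This is a minimal illustration, assuming the catalog is given as a list of item strings; the function name build_user_message is hypothetical and not part of the model or of transformers. The layout itself (a "Catalog:" block, a blank line, then "User:" and the query) mirrors the inference snippet in this card.

```python
def build_user_message(catalog_items, user_query):
    """Return the user-turn content: a catalog block, a blank line, then the query.

    Hypothetical helper for illustration; it only reproduces the layout
    used in this card's inference snippet.
    """
    catalog = "Catalog:\n" + "\n".join(catalog_items)
    return f"{catalog}\n\nUser:\n{user_query}"


message = build_user_message(
    ["Shampoo (400ml bottle)", "Headphones"],
    "Remove 3 shampoos",
)

# The result is passed as a single user turn to the chat template.
chat = [{"role": "user", "content": message}]
```

The resulting chat list can be handed to tokenizer.apply_chat_template in the same way as in the inference code.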

Installation

First, make sure you have the necessary libraries installed (accelerate is required for device_map="auto"):

pip install -q optimum auto-gptq transformers accelerate

Inference Code

Here is a sample Python snippet to run inference with this model.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jtlicardo/smollm2-360m-ecommerce-intent-gptq"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# The quantization settings are read from the checkpoint's config,
# so no explicit GPTQConfig is passed here.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the weights on the available device(s)
)

# --- Create your prompt ---
catalog = """Catalog:
Shampoo (400ml bottle)
Hand Soap (250ml dispenser)
Peanut Butter (340g jar)
Headphones
Green Tea (25 tea bags)"""

user_query = "Remove 3 shampoos"

# --- Format the prompt using the model's chat template ---
# The tokenizer handles this automatically
chat = [
    { "role": "user", "content": f"{catalog}\n\nUser:\n{user_query}" },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# --- Run inference ---
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=False,  # greedy decoding, so the output is deterministic
)

# Decode only the newly generated tokens (skip the prompt)
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("--- Model Output ---")
print(result)

# --- Expected Output ---
# {"action": "remove", "product": "Shampoo (400ml bottle)", "quantity": 3}
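Because this model is less accurate than its larger counterpart, it may help to validate the generated text before acting on it. The sketch below is one possible check, not part of the model or its training code. The function name parse_intent is hypothetical, and it assumes that the action values are exactly "add" and "remove", that product must match a catalog entry verbatim, and that quantity is a positive integer.

```python
import json

# Assumption: the model emits exactly these two action strings.
ALLOWED_ACTIONS = {"add", "remove"}


def parse_intent(raw_output, catalog_items):
    """Return the parsed intent dict, or None if the output is unusable."""
    try:
        intent = json.loads(raw_output.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(intent, dict):
        return None
    if intent.get("action") not in ALLOWED_ACTIONS:
        return None
    product = intent.get("product")
    if not isinstance(product, str) or product not in catalog_items:
        return None
    quantity = intent.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
        return None
    return intent


catalog_items = ["Shampoo (400ml bottle)", "Headphones"]
raw = '{"action": "remove", "product": "Shampoo (400ml bottle)", "quantity": 3}'
intent = parse_intent(raw, catalog_items)
```

A None result can be treated as a signal to retry or to fall back to a different handler, depending on the application.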

Fine-tuning Details

The model was fine-tuned on a custom dataset of 100 examples. A 90/10 train/validation split was used to monitor for overfitting, and the checkpoint with the lowest validation loss was selected.
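For illustration, a 90/10 split of 100 examples could be produced as follows. This is only a sketch: the card does not describe how the actual split was made, so the shuffling, the seed, and the function name split_examples are all assumptions.

```python
import random


def split_examples(examples, val_fraction=0.1, seed=0):
    """Shuffle a copy of the examples and return (train, validation) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Stand-in data: integers in place of the real training examples.
train, val = split_examples(range(100))
print(len(train), len(val))  # prints "90 10"
```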
