Product Catalyst SLM
A compact small language model (SLM), fine-tuned with the Unsloth library on synthetic data from the Neuralake platform.
Model Description
Product-Catalyst-Phi3-v1 is a highly specialized, instruction-tuned language model designed to function as an expert advisor for product development, management, and launch strategies. It has been fine-tuned from Microsoft's efficient Phi-3-mini-4k-instruct to adopt the "Product Catalyst" persona.
The primary goal of this model is to provide users with actionable guidance, strategic recommendations, and collaborative support throughout the product lifecycle. It excels at breaking down complex challenges in market research, software development, and business strategy into clear, understandable steps. The model maintains a tone that is innovative, results-driven, and strategic, while remaining approachable and collaborative.
Model Details
- Finetuned from: `microsoft/Phi-3-mini-4k-instruct`. The fine-tuning process was performed on the 4-bit quantized version provided by Unsloth (`unsloth/Phi-3-mini-4k-instruct-bnb-4bit`).
- Finetuned by: Nicole Sarvasi
- Model type: Causal Decoder-Only Transformer
- Language(s): English
- License: Apache 2.0
- Related Models: `microsoft/Phi-3-mini-4k-instruct`
How to Use
This model is a QLoRA adapter and must be loaded using the Unsloth library to ensure correct behavior and leverage performance optimizations.
```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template  # needed for get_chat_template below
import torch

# Load the fine-tuned model from the Hugging Face Hub
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nicolesarvasicosta/product-catalyst-phi3-v1",
)

# Set the chat template and enable Unsloth's fast inference
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "phi-3",
)
FastLanguageModel.for_inference(model)

# --- Define the persona and ask a question ---
system_prompt = "You are Product Catalyst, an expert in product development and launch, leveraging knowledge in product management, market research, software development, and business strategy to help organizations build and launch successful products with guaranteed market fit and results. As a trusted advisor, your role is to provide actionable guidance, strategic recommendations, and collaborative support to drive product success..."  # Abridged for example
user_question = "What are the first three steps I should take when validating a new B2B SaaS idea?"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_question},
]

# --- Tokenize and Generate a Response ---
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 512, use_cache = True)
response_text = tokenizer.batch_decode(outputs)[0]

# --- Extract and Print the Clean Response ---
assistant_marker = "<|assistant|>"
assistant_response_start = response_text.rfind(assistant_marker)
if assistant_response_start != -1:
    print(response_text[assistant_response_start + len(assistant_marker):].strip())
else:
    print(response_text)
```
Hardware and Performance
This model is a Small Language Model (SLM), specifically designed for high efficiency. Thanks to its compact size and the 4-bit quantization applied during training, it has very accessible computational requirements for inference.
- GPU VRAM: This model can be run for inference on a GPU with approximately 6-8 GB of VRAM.
- Cloud Platforms: It is well suited to free-tier Google Colab notebooks (using a T4 GPU with 16 GB of VRAM) and similar cloud GPU instances.
- Local Machines: It can be run locally on a wide range of consumer NVIDIA GPUs, such as the RTX 4050 series and above.
Using the Unsloth library is recommended for inference to leverage its memory and speed optimizations, ensuring the best performance on consumer-grade hardware. While CPU inference is technically possible with offloading, it would be significantly slower.
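As a concrete illustration of these requirements (a minimal sketch, assuming Unsloth's standard `load_in_4bit` and `max_seq_length` loading options rather than anything specific to this card), the model can be loaded in 4-bit precision to stay within the 6-8 GB budget:

```python
from unsloth import FastLanguageModel

# A minimal low-VRAM loading sketch: 4-bit weights plus a capped context
# window keep inference within roughly 6-8 GB of GPU memory. The specific
# values here are illustrative, not requirements.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "nicolesarvasicosta/product-catalyst-phi3-v1",
    max_seq_length = 2048,   # matches the training sequence length
    load_in_4bit   = True,   # 4-bit weights, as used during training
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path
```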
Training Details
Frameworks
The model was fine-tuned using Unsloth, which enables significantly faster training and lower memory usage by leveraging optimized Triton kernels. The training script was built on Hugging Face's `transformers`, `peft` (for LoRA), and `trl` (`SFTTrainer`) libraries.
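As a minimal sketch of how these libraries fit together (assuming the standard Unsloth QLoRA pattern rather than the exact training script), the 4-bit base model is loaded first and LoRA adapters are attached through `peft` before training:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model that the adapter was trained from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/Phi-3-mini-4k-instruct-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit   = True,
)

# Attach trainable LoRA adapters via peft; r, alpha, and dropout match the
# hyperparameter table below, while target_modules is an assumption based on
# Unsloth's usual defaults rather than a value stated in this card.
model = FastLanguageModel.get_peft_model(
    model,
    r              = 16,
    lora_alpha     = 32,
    lora_dropout   = 0.05,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```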
Dataset
- Source: The model was trained on a proprietary, high-quality synthetic dataset generated on the Neuralake platform.
- Size: 1,456 instruction-response pairs.
- Content: Each data record was structured in a chat format, containing a detailed `system` prompt that defines the model's persona, a `user` query posing a complex product-related challenge, and a detailed `assistant` response that exemplifies expert-level advice. A hypothetical record in this format is sketched below.
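The dataset itself is proprietary, so the following is only an invented illustration of the chat format described above; the `messages` key and all content shown are assumptions, not actual samples from the dataset.

```python
# A hypothetical record illustrating the chat format described above; the
# field names and content are invented for illustration, not real samples.
example_record = {
    "messages": [
        {"role": "system",
         "content": "You are Product Catalyst, an expert in product development and launch..."},
        {"role": "user",
         "content": "How should I prioritize features for an MVP with a two-person engineering team?"},
        {"role": "assistant",
         "content": "Start by scoring each candidate feature against your core value proposition..."},
    ]
}
```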
Training Procedure
The model underwent Supervised Fine-Tuning (SFT) using the QLoRA technique for high parameter efficiency.
- Quantization: The base Phi-3 model was quantized to 4-bit NormalFloat (NF4) using `bitsandbytes`, with `bfloat16` as the compute data type.
- Hyperparameters (mapped to code in the sketch after this list):

| Hyperparameter | Value |
| --- | --- |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 2 |
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 (effective batch size: 8) |
| `optimizer` | AdamW (8-bit) |
| `lr_scheduler_type` | linear |
| `lora_r` (rank) | 16 |
| `lora_alpha` | 32 |
| `lora_dropout` | 0.05 |
| `max_seq_length` | 2048 |

- Training Results: The training process converged successfully, reaching a final training loss of 0.449.
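The table maps onto TRL's trainer configuration roughly as follows. This is a hedged reconstruction assuming the classic `SFTTrainer` signature used in Unsloth's notebooks, with `model` and `tokenizer` carried over from the sketch in the Frameworks section, `train_dataset` an assumed chat-formatted dataset with a pre-rendered `"text"` column, and `output_dir` invented for illustration:

```python
from transformers import TrainingArguments
from trl import SFTTrainer

# A hedged reconstruction of the trainer setup from the table above; it is a
# sketch under the stated assumptions, not the author's exact training script.
training_args = TrainingArguments(
    output_dir                  = "outputs",  # assumed; not stated in this card
    learning_rate               = 2e-4,
    num_train_epochs            = 2,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,          # effective batch size: 8
    optim                       = "adamw_8bit",
    lr_scheduler_type           = "linear",
    bf16                        = True,       # bfloat16 compute, per the quantization note
    logging_steps               = 10,
)
trainer = SFTTrainer(
    model              = model,
    tokenizer          = tokenizer,
    train_dataset      = train_dataset,
    dataset_text_field = "text",
    max_seq_length     = 2048,
    args               = training_args,
)
trainer.train()
```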
Evaluation
The model was not evaluated on standardized academic benchmarks. Its performance was assessed through qualitative analysis of its responses to a curated set of prompts covering various aspects of the product lifecycle. The model demonstrates strong instruction-following capabilities, consistent adherence to its "Product Catalyst" persona, and the ability to generate structured, coherent, and contextually relevant advice.
Ethical Considerations & Limitations
- Factual Accuracy: This model is a generative tool and can produce inaccurate or fabricated information ("hallucinate"). It should be used as an assistive tool for brainstorming and guidance, not as a sole source of truth. All critical advice should be verified by a human expert.
- Bias: The model may reflect biases present in its original training data (from the Phi-3 base model) and the synthetic dataset from the Neuralake platform. It may not represent a global or fully inclusive perspective on business and product strategy.
- Scope Limitation: The model's expertise is intentionally narrow. It is designed to excel at product-related topics and will not perform well on out-of-scope queries (e.g., medical advice, personal finance, creative writing).
- Intended Use: This model is intended to augment the capabilities of product managers, entrepreneurs, and students. It is not a replacement for professional human judgment and should not be used to make final, high-stakes business decisions without human oversight.
Citation
```bibtex
@software{nicole_sarvasi_2025_product_catalyst,
  author = {Nicole Sarvasi},
  title  = {Product-Catalyst-Phi3-v1: A Fine-Tuned Expert on Product Development},
  month  = {July},
  year   = {2025},
  url    = {https://huggingface.co/nicolesarvasicosta/product-catalyst-phi3-v1}
}
```
This Phi-3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.