Product Catalyst SLM
A compact small language model (SLM), fine-tuned with the Unsloth library on synthetic data from the Neuralake platform.
Model Description
Product-Catalyst-Phi3-v1 is a highly specialized, instruction-tuned language model designed to function as an expert advisor for product development, management, and launch strategies. It has been fine-tuned from Microsoft's efficient Phi-3-mini-4k-instruct to adopt the "Product Catalyst" persona.
The primary goal of this model is to provide users with actionable guidance, strategic recommendations, and collaborative support throughout the product lifecycle. It excels at breaking down complex challenges in market research, software development, and business strategy into clear, understandable steps. The model maintains a tone that is innovative, results-driven, and strategic, while remaining approachable and collaborative.
Model Details
- Finetuned from: `microsoft/Phi-3-mini-4k-instruct`. The fine-tuning process was performed on the 4-bit quantized version provided by Unsloth (`unsloth/Phi-3-mini-4k-instruct-bnb-4bit`).
- Finetuned by: Nicole Sarvasi
- Model type: Causal Decoder-Only Transformer
- Language(s): English
- License: Apache 2.0
- Related Models: `microsoft/Phi-3-mini-4k-instruct`
How to Use
This model is a QLoRA adapter and must be loaded using the Unsloth library to ensure correct behavior and leverage performance optimizations.
```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template  # needed for get_chat_template below
import torch

# Load the fine-tuned model from the Hugging Face Hub
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "nicolesarvasicosta/product-catalyst-phi3-v1",
)

# Set the chat template and enable Unsloth's fast inference
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "phi-3",
)
FastLanguageModel.for_inference(model)

# --- Define the persona and ask a question ---
system_prompt = "You are Product Catalyst, an expert in product development and launch, leveraging knowledge in product management, market research, software development, and business strategy to help organizations build and launch successful products with guaranteed market fit and results. As a trusted advisor, your role is to provide actionable guidance, strategic recommendations, and collaborative support to drive product success..."  # Abridged for example
user_question = "What are the first three steps I should take when validating a new B2B SaaS idea?"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_question},
]

# --- Tokenize and Generate a Response ---
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 512, use_cache = True)
response_text = tokenizer.batch_decode(outputs)[0]

# --- Extract and Print the Clean Response ---
assistant_marker = "<|assistant|>"
assistant_response_start = response_text.rfind(assistant_marker)
if assistant_response_start != -1:
    print(response_text[assistant_response_start + len(assistant_marker):].strip())
else:
    print(response_text)
```
Hardware and Performance
This model is a Small Language Model (SLM), specifically designed for high efficiency. Thanks to its compact size and the 4-bit quantization applied during training, it has very accessible computational requirements for inference.
- GPU VRAM: This model can be run for inference on a GPU with approximately 6-8 GB of VRAM.
- Cloud Platforms: It is well suited to free-tier Google Colab notebooks (using a T4 GPU with 16 GB of VRAM) and similar cloud GPU instances.
- Local Machines: It can be run locally on a wide range of consumer NVIDIA GPUs, such as the RTX 4050 series and above.
Using the Unsloth library is recommended for inference to leverage its memory and speed optimizations, ensuring the best performance on consumer-grade hardware. While CPU inference is technically possible with offloading, it would be significantly slower.
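As a concrete illustration of these requirements (a minimal sketch, assuming Unsloth's standard `load_in_4bit` and `max_seq_length` loading options rather than anything specific to this card), the model can be loaded in 4-bit precision to stay within the 6-8 GB budget:

```python
from unsloth import FastLanguageModel

# A minimal low-VRAM loading sketch: 4-bit weights plus a capped context
# window keep inference within roughly 6-8 GB of GPU memory. The specific
# values here are illustrative, not requirements.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "nicolesarvasicosta/product-catalyst-phi3-v1",
    max_seq_length = 2048,   # matches the training sequence length
    load_in_4bit   = True,   # 4-bit weights, as used during training
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path
```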
Training Details
Frameworks
The model was fine-tuned using Unsloth, which enables significantly faster training and lower memory usage by leveraging optimized Triton kernels. The training script was built on Hugging Face's `transformers`, `peft` (for LoRA), and `trl` (`SFTTrainer`) libraries.
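As a minimal sketch of how these libraries fit together (assuming the standard Unsloth QLoRA pattern rather than the exact training script), the 4-bit base model is loaded first and LoRA adapters are attached through `peft` before training:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model that the adapter was trained from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "unsloth/Phi-3-mini-4k-instruct-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit   = True,
)

# Attach trainable LoRA adapters via peft; r, alpha, and dropout match the
# hyperparameter table below, while target_modules is an assumption based on
# Unsloth's usual defaults rather than a value stated in this card.
model = FastLanguageModel.get_peft_model(
    model,
    r              = 16,
    lora_alpha     = 32,
    lora_dropout   = 0.05,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```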
Dataset
- Source: The model was trained on a proprietary, high-quality synthetic dataset generated on the Neuralake platform.
- Size: 1,456 instruction-response pairs.
- Content: Each data record was structured in a chat format, containing a detailed `system` prompt that defines the model's persona, a `user` query posing a complex product-related challenge, and a detailed `assistant` response that exemplifies expert-level advice. A hypothetical record in this format is sketched below.
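The dataset itself is proprietary, so the following is only an invented illustration of the chat format described above; the `messages` key and all content shown are assumptions, not actual samples from the dataset.

```python
# A hypothetical record illustrating the chat format described above; the
# field names and content are invented for illustration, not real samples.
example_record = {
    "messages": [
        {"role": "system",
         "content": "You are Product Catalyst, an expert in product development and launch..."},
        {"role": "user",
         "content": "How should I prioritize features for an MVP with a two-person engineering team?"},
        {"role": "assistant",
         "content": "Start by scoring each candidate feature against your core value proposition..."},
    ]
}
```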
Training Procedure
The model underwent Supervised Fine-Tuning (SFT) using the QLoRA technique for high parameter efficiency.
- Quantization: The base Phi-3 model was quantized to 4-bit NormalFloat (NF4) using `bitsandbytes`, with `bfloat16` as the compute data type.
- Hyperparameters (mapped to code in the sketch after this list):

| Hyperparameter | Value |
| --- | --- |
| `learning_rate` | 2e-4 |
| `num_train_epochs` | 2 |
| `per_device_train_batch_size` | 2 |
| `gradient_accumulation_steps` | 4 (effective batch size: 8) |
| `optimizer` | AdamW (8-bit) |
| `lr_scheduler_type` | linear |
| `lora_r` (rank) | 16 |
| `lora_alpha` | 32 |
| `lora_dropout` | 0.05 |
| `max_seq_length` | 2048 |

- Training Results: The training process converged successfully, reaching a final training loss of 0.449.
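The table maps onto TRL's trainer configuration roughly as follows. This is a hedged reconstruction assuming the classic `SFTTrainer` signature used in Unsloth's notebooks, with `model` and `tokenizer` carried over from the sketch in the Frameworks section, `train_dataset` an assumed chat-formatted dataset with a pre-rendered `"text"` column, and `output_dir` invented for illustration:

```python
from transformers import TrainingArguments
from trl import SFTTrainer

# A hedged reconstruction of the trainer setup from the table above; it is a
# sketch under the stated assumptions, not the author's exact training script.
training_args = TrainingArguments(
    output_dir                  = "outputs",  # assumed; not stated in this card
    learning_rate               = 2e-4,
    num_train_epochs            = 2,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,          # effective batch size: 8
    optim                       = "adamw_8bit",
    lr_scheduler_type           = "linear",
    bf16                        = True,       # bfloat16 compute, per the quantization note
    logging_steps               = 10,
)
trainer = SFTTrainer(
    model              = model,
    tokenizer          = tokenizer,
    train_dataset      = train_dataset,
    dataset_text_field = "text",
    max_seq_length     = 2048,
    args               = training_args,
)
trainer.train()
```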
Evaluation
The model was not evaluated on standardized academic benchmarks. Its performance was assessed through qualitative analysis of its responses to a curated set of prompts covering various aspects of the product lifecycle. The model demonstrates strong instruction-following capabilities, consistent adherence to its "Product Catalyst" persona, and the ability to generate structured, coherent, and contextually relevant advice.
Ethical Considerations & Limitations
- Factual Accuracy: This model is a generative tool and can produce inaccurate or fabricated information ("hallucinate"). It should be used as an assistive tool for brainstorming and guidance, not as a sole source of truth. All critical advice should be verified by a human expert.
- Bias: The model may reflect biases present in its original training data (from the Phi-3 base model) and the synthetic dataset from the Neuralake platform. It may not represent a global or fully inclusive perspective on business and product strategy.
- Scope Limitation: The model's expertise is intentionally narrow. It is designed to excel at product-related topics and will not perform well on out-of-scope queries (e.g., medical advice, personal finance, creative writing).
- Intended Use: This model is intended to augment the capabilities of product managers, entrepreneurs, and students. It is not a replacement for professional human judgment and should not be used to make final, high-stakes business decisions without human oversight.
Citation
```bibtex
@software{nicole_sarvasi_2025_product_catalyst,
  author = {Nicole Sarvasi},
  title  = {Product-Catalyst-Phi3-v1: A Fine-Tuned Expert on Product Development},
  month  = {July},
  year   = {2025},
  url    = {https://huggingface.co/nicolesarvasicosta/product-catalyst-phi3-v1}
}
```
This Phi-3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.