SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
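The first step can be pictured as turning the few labeled examples into sentence pairs: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The following is a minimal sketch of that idea in plain Python, not SetFit's actual sampler; the function name `make_contrastive_pairs` and the four-example dataset are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair of examples."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        # Same label -> embeddings should be similar (1.0); otherwise dissimilar (0.0).
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy subset, borrowed from the label examples in this card.
texts = ["Can I do COD?", "COD option is availble?", "0% EMI.", "Warranty"]
labels = ["COD", "COD", "EMI", "WARRANTY"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))
```

A loss such as CosineSimilarityLoss (listed under the training hyperparameters) then pushes the cosine similarity of each pair's embeddings toward its target. The classification head is fitted afterwards on embeddings produced by the fine-tuned body.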

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 21 classes

Model Sources

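Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)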

Model Labels

Label	Examples
EMI	'You guys provide EMI option?' 'Do you offer Zero Percent EMI payment options?' '0% EMI.'
COD	'COD option is availble?' 'Do you offer COD to my pincode?' 'Can I do COD?'
ORTHO_FEATURES	'Features of Ortho mattress' 'What are the key features of the SOF Ortho mattress' 'SOF ortho'
ERGO_FEATURES	'What are the key features of the SOF Ergo mattress' 'Features of Ergo mattress' 'SOF ergo'
COMPARISON	'What is the difference between the Ergo & Ortho variants' 'Difference between Ergo & Ortho Mattress' 'Difference between the products'
WARRANTY	'What is the warranty period?' 'Warranty' 'Does mattress cover is included in warranty'
100_NIGHT_TRIAL_OFFER	'How does the 100 night trial work' 'What is the 100-night offer' 'Trial details'
SIZE_CUSTOMIZATION	'I want to change the size of the mattress.' 'Need some help in changing size of the mattress' 'How can I order a custom sized mattress'
WHAT_SIZE_TO_ORDER	'Can you help with the size?' 'How do I know what size to order?' 'How do I know the size of my bed?'
LEAD_GEN	'Get in Touch' 'Want to talk to an live agent' ' Please call me'
CHECK_PINCODE	'Do you deliver to my pincode' 'Check pincode' 'Is delivery possible on this pincode'
DISTRIBUTORS	'Do you have any showrooms in Delhi state' 'Do you have any distributors in Mumbai city' 'Do you have any retailers in Pune city'
MATTRESS_COST	'Price of mattress' 'Mattress cost' 'Cost of mattress'
PRODUCT_VARIANTS	'What are the product variants' 'Product Variants' 'Help me with different products'
ABOUT_SOF_MATTRESS	'How is SOF different from other mattress brands' 'Why SOF mattress' 'About SOF Mattress'
DELAY_IN_DELIVERY	"It's been a month" 'Why so long?' 'I did not receive my order yet'
ORDER_STATUS	'Order Status' 'What is my order status?' 'Order related'
RETURN_EXCHANGE	'Need my money back' 'I want refund' 'Refund'
CANCEL_ORDER	'I want to cancel my order' 'How can I cancel my order' 'Cancel order'
PILLOWS	'Can I get pillows?' 'Do you sell pillows?' 'Pillows'
OFFERS	'Offers' 'What are the available offers' 'Give me some discount'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("huiyeong/setfit-sofmattress")
# Run inference on a single query; the result is one of the labels listed above
preds = model("Do you deliver in Canada")
print(preds)
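For several queries at once, or when a confidence score is useful (for example to route unclear queries to a fallback), something like the following may help. It is a sketch under assumptions: it relies on SetFit's `predict` and `predict_proba` methods and the `labels` attribute, and it assumes the probability columns follow the order of `model.labels`. The helper names `load_model`, `top_label`, and `demo`, the 0.5 threshold, the "UNKNOWN" fallback, and the second query are all hypothetical.

```python
def load_model(repo_id="huiyeong/setfit-sofmattress"):
    # Imported lazily so top_label() can be used without SetFit installed.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(repo_id)


def top_label(probabilities, labels, threshold=0.5, fallback="UNKNOWN"):
    """Return (label, score) for the most probable class, or (fallback, score)
    when the best score is below `threshold`."""
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    score = float(probabilities[best])
    return (labels[best], score) if score >= threshold else (fallback, score)


def demo():
    # Downloads the model from the Hub; requires network access and `pip install setfit`.
    model = load_model()
    queries = ["Do you deliver in Canada", "Can I pay in installments?"]
    print(model.predict(queries))  # one predicted label per query
    for query, probs in zip(queries, model.predict_proba(queries)):
        print(query, "->", top_label(probs.tolist(), model.labels))
```

Calling `demo()` runs the full example. `top_label` itself only needs a list of probabilities and a list of label names, so the threshold logic can be tried without loading the model.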

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	4.3110	22

Label	Training Sample Count
100_NIGHT_TRIAL_OFFER	18
ABOUT_SOF_MATTRESS	11
CANCEL_ORDER	10
CHECK_PINCODE	10
COD	12
COMPARISON	11
DELAY_IN_DELIVERY	11
DISTRIBUTORS	34
EMI	25
ERGO_FEATURES	11
LEAD_GEN	21
MATTRESS_COST	22
OFFERS	10
ORDER_STATUS	21
ORTHO_FEATURES	17
PILLOWS	10
PRODUCT_VARIANTS	21
RETURN_EXCHANGE	14
SIZE_CUSTOMIZATION	9
WARRANTY	10
WHAT_SIZE_TO_ORDER	20
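The per-label counts are uneven, which is worth keeping in mind when reading the sampling settings below. A quick tally, using only the numbers from the table above, could look like this:

```python
# Training sample counts copied from the table above.
train_counts = {
    "100_NIGHT_TRIAL_OFFER": 18, "ABOUT_SOF_MATTRESS": 11, "CANCEL_ORDER": 10,
    "CHECK_PINCODE": 10, "COD": 12, "COMPARISON": 11, "DELAY_IN_DELIVERY": 11,
    "DISTRIBUTORS": 34, "EMI": 25, "ERGO_FEATURES": 11, "LEAD_GEN": 21,
    "MATTRESS_COST": 22, "OFFERS": 10, "ORDER_STATUS": 21, "ORTHO_FEATURES": 17,
    "PILLOWS": 10, "PRODUCT_VARIANTS": 21, "RETURN_EXCHANGE": 14,
    "SIZE_CUSTOMIZATION": 9, "WARRANTY": 10, "WHAT_SIZE_TO_ORDER": 20,
}

total = sum(train_counts.values())
largest = max(train_counts, key=train_counts.get)
smallest = min(train_counts, key=train_counts.get)

print(f"{total} samples across {len(train_counts)} labels")
print(f"largest: {largest} ({train_counts[largest]}), "
      f"smallest: {smallest} ({train_counts[smallest]})")
```

This gives 328 training samples over 21 labels, with the largest class (DISTRIBUTORS, 34) holding a little under four times as many examples as the smallest (SIZE_CUSTOMIZATION, 9).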

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (5, 5)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 5
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
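To reproduce a comparable setup, these values can be passed to SetFit's `TrainingArguments`. The sketch below is an assumption about how the run could be configured, not the author's training script. `loss` and `distance_metric` are omitted because they take library objects rather than plain values and are assumed here to match the library defaults; `eval_max_steps` is also left at its default. The names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical.

```python
# Values copied from the list above. Tuples are (embedding phase, classifier phase).
HYPERPARAMETERS = {
    "batch_size": (16, 16),
    "num_epochs": (5, 5),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 5,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    # Imported lazily so the dictionary can be inspected without SetFit installed.
    from setfit import TrainingArguments

    return TrainingArguments(**HYPERPARAMETERS)
```

The resulting object would typically be handed to a SetFit `Trainer` together with the model and a training dataset.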

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0049	1	0.2688	-
0.2439	50	0.1598	-
0.4878	100	0.1194	-
0.7317	150	0.0722	-
0.9756	200	0.0475	-
1.2195	250	0.0303	-
1.4634	300	0.0288	-
1.7073	350	0.0226	-
1.9512	400	0.0165	-
2.1951	450	0.012	-
2.4390	500	0.0114	-
2.6829	550	0.0105	-
2.9268	600	0.0092	-
3.1707	650	0.007	-
3.4146	700	0.0051	-
3.6585	750	0.0068	-
3.9024	800	0.0062	-
4.1463	850	0.0058	-
4.3902	900	0.0054	-
4.6341	950	0.0048	-
4.8780	1000	0.0043	-
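The Epoch and Step columns can be cross-checked with a little arithmetic. The sketch below assumes that each epoch of the embedding phase draws `num_iterations` positive and `num_iterations` negative pairs per training sample; that assumption is not stated in this card, but it is consistent with the logged values. The sample total of 328 is the sum of the Training Sample Count column.

```python
n_train = 328          # sum of the per-label training sample counts
num_iterations = 5     # from the training hyperparameters
batch_size = 16        # embedding-phase batch size
num_epochs = 5         # embedding-phase epochs

# Assumption: one positive and one negative pair per sample per iteration.
pairs_per_epoch = 2 * num_iterations * n_train
steps_per_epoch = pairs_per_epoch // batch_size
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(pairs_per_epoch, steps_per_epoch, total_steps)
print(epoch_at(1), epoch_at(50), epoch_at(1000))
```

Under that assumption an epoch is 205 steps, so steps 1, 50, and 1000 fall at epochs 0.0049, 0.2439, and 4.878, matching the table. The full run would end at step 1025, which is why the last logged row (step 1000) sits just short of epoch 5.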

Framework Versions

Python: 3.11.13
SetFit: 1.1.2
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.6.0+cu124
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
