SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
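The first step relies on sentence pairs built from the labeled examples: two sentences with the same label form a positive pair, two with different labels form a negative pair. The sketch below only illustrates that idea. It enumerates every pair exhaustively, whereas SetFit samples pairs according to its own strategy, and the function name make_contrastive_pairs is hypothetical rather than part of the SetFit API. The four sentences are taken from the label table in this card.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    This is an illustrative helper, not SetFit's own pair sampler.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Four example sentences copied from the label table
texts = [
    "Can I do COD?",
    "Do you offer COD to my pincode?",
    "0% EMI.",
    "You guys provide EMI option?",
]
labels = ["COD", "COD", "EMI", "EMI"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r}  <->  {text_b!r}")
```

In the second step, the fine-tuned body embeds each training sentence and those embeddings become the features for the LogisticRegression head.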

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 21

Model Sources

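Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)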

Model Labels

Label	Examples
EMI	'You guys provide EMI option?' 'Do you offer Zero Percent EMI payment options?' '0% EMI.'
COD	'COD option is availble?' 'Do you offer COD to my pincode?' 'Can I do COD?'
ORTHO_FEATURES	'Features of Ortho mattress' 'What are the key features of the SOF Ortho mattress' 'SOF ortho'
ERGO_FEATURES	'What are the key features of the SOF Ergo mattress' 'Features of Ergo mattress' 'SOF ergo'
COMPARISON	'What is the difference between the Ergo & Ortho variants' 'Difference between Ergo & Ortho Mattress' 'Difference between the products'
WARRANTY	'What is the warranty period?' 'Warranty' 'Does mattress cover is included in warranty'
100_NIGHT_TRIAL_OFFER	'How does the 100 night trial work' 'What is the 100-night offer' 'Trial details'
SIZE_CUSTOMIZATION	'I want to change the size of the mattress.' 'Need some help in changing size of the mattress' 'How can I order a custom sized mattress'
WHAT_SIZE_TO_ORDER	'Can you help with the size?' 'How do I know what size to order?' 'How do I know the size of my bed?'
LEAD_GEN	'Get in Touch' 'Want to talk to an live agent' ' Please call me'
CHECK_PINCODE	'Do you deliver to my pincode' 'Check pincode' 'Is delivery possible on this pincode'
DISTRIBUTORS	'Do you have any showrooms in Delhi state' 'Do you have any distributors in Mumbai city' 'Do you have any retailers in Pune city'
MATTRESS_COST	'Price of mattress' 'Mattress cost' 'Cost of mattress'
PRODUCT_VARIANTS	'What are the product variants' 'Product Variants' 'Help me with different products'
ABOUT_SOF_MATTRESS	'How is SOF different from other mattress brands' 'Why SOF mattress' 'About SOF Mattress'
DELAY_IN_DELIVERY	"It's been a month" 'Why so long?' 'I did not receive my order yet'
ORDER_STATUS	'Order Status' 'What is my order status?' 'Order related'
RETURN_EXCHANGE	'Need my money back' 'I want refund' 'Refund'
CANCEL_ORDER	'I want to cancel my order' 'How can I cancel my order' 'Cancel order'
PILLOWS	'Can I get pillows?' 'Do you sell pillows?' 'Pillows'
OFFERS	'Offers' 'What are the available offers' 'Give me some discount'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("huiyeong/setfit-sofmattress-neg")
# Run inference on a single query and show the prediction
preds = model("Do you deliver in Canada")
print(preds)
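When a confidence score is wanted alongside the predicted label, class probabilities can be reduced to a single (label, score) pair. The helper below is a minimal sketch with a hypothetical name, top_prediction, and made-up probability values. With a real model, the probabilities would presumably come from model.predict_proba(...) and the class order from the scikit-learn head's classes_ attribute (model.model_head.classes_). Both are assumptions about the installed SetFit version and are not shown in this card.

```python
def top_prediction(probabilities, labels):
    """Return (label, probability) for the highest-scoring class.

    probabilities: one probability per class, in the same order as labels.
    labels: class names, e.g. the classes_ of a LogisticRegression head.
    """
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], float(probabilities[best])


# Made-up probabilities for three of the 21 classes, for illustration only
example_labels = ["CHECK_PINCODE", "COD", "DISTRIBUTORS"]
example_probs = [0.7, 0.1, 0.2]

label, score = top_prediction(example_probs, example_labels)
print(label, score)
```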

Training Details

Training Set Metrics

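Training set	Min	Median	Max
Word count	1	4.3619	22

Note that the Median figure is neither a whole nor a half number, so it cannot be the median of integer word counts. It matches the mean instead: roughly 1627 words across the 373 training samples counted in the table below.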
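Word-count statistics of this kind can be reproduced with the standard library. The sketch below assumes words are separated by whitespace, which may differ from how the card's figures were produced, and it uses five sentences from the label table, not the full training set.

```python
from statistics import mean, median

# Five example sentences copied from the label table (not the full training set)
texts = [
    "Warranty",
    "Can I do COD?",
    "Price of mattress",
    "What is my order status?",
    "Do you deliver to my pincode",
]

# Assumption: a "word" is any whitespace-separated token
word_counts = [len(text.split()) for text in texts]

stats = {
    "min": min(word_counts),
    "median": median(word_counts),
    "mean": mean(word_counts),
    "max": max(word_counts),
}
print(stats)
```

Reporting both the median and the mean makes it easy to tell which of the two a published figure corresponds to.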

Label	Training Sample Count
100_NIGHT_TRIAL_OFFER	20
ABOUT_SOF_MATTRESS	11
CANCEL_ORDER	11
CHECK_PINCODE	10
COD	12
COMPARISON	12
DELAY_IN_DELIVERY	12
DISTRIBUTORS	39
EMI	27
ERGO_FEATURES	13
LEAD_GEN	25
MATTRESS_COST	23
OFFERS	13
ORDER_STATUS	24
ORTHO_FEATURES	23
PILLOWS	13
PRODUCT_VARIANTS	24
RETURN_EXCHANGE	15
SIZE_CUSTOMIZATION	11
WARRANTY	13
WHAT_SIZE_TO_ORDER	22

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (5, 5)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 5
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0043	1	0.2676	-
0.2137	50	0.1931	-
0.4274	100	0.1418	-
0.6410	150	0.1097	-
0.8547	200	0.0838	-
1.0684	250	0.0579	-
1.2821	300	0.0437	-
1.4957	350	0.0338	-
1.7094	400	0.0287	-
1.9231	450	0.0245	-
2.1368	500	0.0167	-
2.3504	550	0.0164	-
2.5641	600	0.0135	-
2.7778	650	0.0118	-
2.9915	700	0.0147	-
3.2051	750	0.0096	-
3.4188	800	0.008	-
3.6325	850	0.0094	-
3.8462	900	0.0084	-
4.0598	950	0.0107	-
4.2735	1000	0.0083	-
4.4872	1050	0.0068	-
4.7009	1100	0.0065	-
4.9145	1150	0.0064	-
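The Epoch column can be cross-checked against the hyperparameters and the training sample counts. The sketch below assumes that each of the num_iterations passes yields one positive and one negative pair per training sample. That is my reading of how SetFit builds pairs when num_iterations is set, and the card does not state it. Under that assumption the arithmetic gives 234 steps per epoch, which reproduces the logged epoch values.

```python
import math

# Training sample counts per label, copied from the card
label_counts = [20, 11, 11, 10, 12, 12, 12, 39, 27, 13, 25,
                23, 13, 24, 23, 13, 24, 15, 11, 13, 22]

# Hyperparameters copied from the card (embedding fine-tuning phase)
num_iterations = 5
batch_size = 16
num_epochs = 5

n_samples = sum(label_counts)

# Assumption: one positive and one negative pair per sample per iteration
n_pairs = n_samples * num_iterations * 2
steps_per_epoch = math.ceil(n_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after the given optimizer step."""
    return round(step / steps_per_epoch, 4)


print(n_samples, n_pairs, steps_per_epoch, total_steps)
for step in (1, 50, 1150):
    print(step, epoch_at(step))
```

The last logged step, 1150, falls just short of the 1170 steps this estimate gives for five full epochs, consistent with logging every 50 steps.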

Framework Versions

Python: 3.11.13
SetFit: 1.1.2
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.6.0+cu124
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
