SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
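The contrastive step relies on sentence pairs labelled by whether both sentences share an intent. The following is a minimal sketch of that pairing idea, not the SetFit implementation: the helper name make_contrastive_pairs is hypothetical, and it enumerates every pair, whereas SetFit samples pairs according to its sampling strategy. The targets of 1.0 and 0.0 are the kind of similarity scores a cosine similarity loss can be trained against.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples from labelled sentences.

    target is 1.0 when both sentences share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((a, b, target))
    rng.shuffle(pairs)  # mix positive and negative pairs
    return pairs


# Example utterances taken from the label table in this card.
texts = ["Track my order", "What is my shipment status", "Start over", "Start again"]
labels = ["ORDER_STATUS", "ORDER_STATUS", "START_OVER", "START_OVER"]
pairs = make_contrastive_pairs(texts, labels)
for a, b, target in pairs:
    print(f"{target:.1f}  {a!r} / {b!r}")
```

Once the embedding model has been fine-tuned on such pairs, the second step fits the classification head on the embeddings of the original labelled sentences.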

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 28

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
CALL_CENTER	'What time is your call centre operational during COVID?' 'is the call center still functioning during lockdown' 'what are the working hours of your call center during covid lockdown'
CANCEL_ORDER	"I'd like to cancel my pending order" 'How can I cancel my pending order?' 'Kindly cancel my order'
CHAT_WITH_AGENT	'Chat with agent' 'I need customer support' 'I want to chat with an agent'
CONSULT_START	'Tell me weight gaining' 'Consult Start' 'suggest me beginner diet'
DELAY_IN_PARCEL	'Is there a delay in delivery becuase of the pandemic?' 'How long is parcel delayed because of COVID?' 'Why is my delivery late'
EXPIRY_DATE	'What if I receive expired product' 'I have received an Expired product' 'Expiry Date'
FRANCHISE	'Get Franchise' 'would like to associated as seller' 'i want to enroll my self as a seller'
ORDER_STATUS	'Track my order' 'What is my shipment status' 'What is the progress of my orders'
INTERNATIONAL_SHIPPING	'Delivery out of India' 'International Shipping' 'Out of India'
MODES_OF_PAYMENTS	'Modes of payments' 'ways of paymets' 'Accepted modes of payments'
MODIFY_ADDRESS	'Change delivery address?' 'Delivery address is wrong it is to be changed' 'I want to change my delivery address'
ORDER_QUERY	'I have a query related to my order' 'Help required on order' 'details needed for my order'
ORDER_TAKING	'Are you taking orders during COVID?' 'i know its lockdown due to coronavirus but can i still place an order?' 'I wanted to order some things, can I place an order on the website?'
ORIGINAL_PRODUCT	'Original Products' 'do you have authentic products' 'Is your product original'
REFUNDS_RETURNS_REPLACEMENTS	'I want to know my refund status' 'I want to know about my replacements' 'I havent received my refund it has been many days since the return'
PAYMENT_AND_BILL	'I want to know about my payments' 'Payments and Bills' 'I have Payment & Bill Related Queries'
PORTAL_ISSUE	'Portal not working' 'Option is not visible' 'Unable to see product in my cart'
CHECK_PINCODE	'Product Service' 'pincode serviceable' 'I wanted to know whether you are delivering in'
RECOMMEND_PRODUCT	'Recommend a product' 'What are all the products you have?' "I am confused about what to buy since there are too many options I and I really don't know what I should focus on right now"
REFER_EARN	'Reedem referral' 'My friend refer me to CureKart' 'Refer Amount'
RESUME_DELIVERY	'When will you resume delivery due to COVID?' 'are you going to start delivery during this lockdown period as well?' 'other websites like lagoon are delivering when will curekart start again to deliver?'
SIDE_EFFECT	'It has any side effects or not' 'Does it have side effects?' 'is there any side effects'
SIGN_UP	'New to CureKart?' 'Where can I sign up' 'I am a new user'
START_OVER	'Show me the main menu' 'Start again' 'Start over'
STORE_INFORMATION	'Can I visit your store' 'Àre your shops operational' 'Are stores still opening?'
USER_GOAL_FORM	'Re-assess my profile' 'I would want to take re-assessment' 'Fill my goal'
WORK_FROM_HOME	'Is your head office working during lockdown?' 'is curekart office open during the lockdown?' 'I wanted to talk to contact your head office for some work but is it open?'
IMMUNITY	'How can I increase my immunity' 'I want to increase my immunity power' 'Increase immunity power'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("huiyeong/setfit-curekart")
# Run inference
preds = model("+1 offer kya h")
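A SetFit model can also be called with a list of strings to classify several utterances at once. The sketch below wraps that call in a small helper; classify_batch and demo are hypothetical names introduced for illustration, and the helper accepts any callable that maps a list of texts to a sequence of labels, so it can be exercised without downloading the model.

```python
from typing import Callable, Dict, List, Sequence


def classify_batch(model: Callable[[List[str]], Sequence], texts: List[str]) -> Dict[str, str]:
    """Return a {text: predicted label} mapping for a batch of inputs."""
    preds = model(texts)  # one prediction per input text
    return {text: str(label) for text, label in zip(texts, preds)}


def demo() -> None:
    """Load the Hub model and classify two utterances (needs setfit and network access)."""
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained("huiyeong/setfit-curekart")
    for text, label in classify_batch(model, ["Track my order", "Start over"]).items():
        print(f"{text!r} -> {label}")
```

Calling demo() runs the batch against the published model. Note that duplicate input texts collapse into a single dictionary entry.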

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	6.0417	26

Label	Training Sample Count
CALL_CENTER	21
CANCEL_ORDER	12
CHAT_WITH_AGENT	40
CHECK_PINCODE	14
CONSULT_START	26
DELAY_IN_PARCEL	23
EXPIRY_DATE	8
FRANCHISE	12
IMMUNITY	6
INTERNATIONAL_SHIPPING	3
MODES_OF_PAYMENTS	7
MODIFY_ADDRESS	16
ORDER_QUERY	7
ORDER_STATUS	47
ORDER_TAKING	39
ORIGINAL_PRODUCT	23
PAYMENT_AND_BILL	26
PORTAL_ISSUE	4
RECOMMEND_PRODUCT	95
REFER_EARN	13
REFUNDS_RETURNS_REPLACEMENTS	54
RESUME_DELIVERY	51
SIDE_EFFECT	4
SIGN_UP	7
START_OVER	5
STORE_INFORMATION	14
USER_GOAL_FORM	12
WORK_FROM_HOME	10

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (5, 5)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 5
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0027	1	0.4136	-
0.1333	50	0.233	-
0.2667	100	0.1791	-
0.4	150	0.1243	-
0.5333	200	0.0921	-
0.6667	250	0.0745	-
0.8	300	0.0569	-
0.9333	350	0.0483	-
1.0667	400	0.0366	-
1.2	450	0.0304	-
1.3333	500	0.0264	-
1.4667	550	0.0247	-
1.6	600	0.0286	-
1.7333	650	0.0231	-
1.8667	700	0.0232	-
2.0	750	0.024	-
2.1333	800	0.0126	-
2.2667	850	0.0126	-
2.4	900	0.012	-
2.5333	950	0.0152	-
2.6667	1000	0.013	-
2.8	1050	0.0094	-
2.9333	1100	0.013	-
3.0667	1150	0.0079	-
3.2	1200	0.0087	-
3.3333	1250	0.0057	-
3.4667	1300	0.0047	-
3.6	1350	0.0073	-
3.7333	1400	0.0076	-
3.8667	1450	0.0089	-
4.0	1500	0.0074	-
4.1333	1550	0.0033	-
4.2667	1600	0.0063	-
4.4	1650	0.0057	-
4.5333	1700	0.0058	-
4.6667	1750	0.0039	-
4.8	1800	0.0055	-
4.9333	1850	0.0059	-
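The Epoch column can be reproduced from the hyperparameters and the training sample counts listed above. The sketch below assumes that, with num_iterations set, each epoch draws num_iterations positive and num_iterations negative pairs per training sample; that is an assumption about how the pairs were generated, but it matches the logged values.

```python
import math

n_samples = 599      # sum of the per-label training sample counts
num_iterations = 5   # from the training hyperparameters
batch_size = 16      # embedding-phase batch size

# Assumption: num_iterations positive plus num_iterations negative pairs per sample.
pairs_per_epoch = 2 * num_iterations * n_samples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)


def epoch_at(step: int) -> float:
    """Fractional epoch reached after the given optimizer step."""
    return round(step / steps_per_epoch, 4)


print(pairs_per_epoch, steps_per_epoch)
for step in (1, 50, 750, 1850):
    print(step, epoch_at(step))
```

Under this assumption one epoch is 375 steps, so the 5 configured epochs end at step 1875, just after the last logged row at step 1850.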

Framework Versions

Python: 3.11.13
SetFit: 1.1.2
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.6.0+cu124
Datasets: 3.6.0
Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
