SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
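
The first step relies on sentence pairs built from the labeled examples: pairs with the same label are pulled together, pairs with different labels are pushed apart. The sketch below is a simplified illustration of that pair construction, not SetFit's actual sampler; the function name `make_contrastive_pairs` and the four-example dataset are hypothetical, and the 1.0/0.0 targets assume a cosine-similarity style loss. The second step (fitting the classification head on the resulting embeddings) is not shown.

```python
import random
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str],
    labels: List[str],
    num_iterations: int,
    seed: int = 42,
) -> List[Tuple[str, str, float]]:
    """Build (anchor, other, target) pairs: target 1.0 for same label, 0.0 otherwise."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates that share the anchor's label (excluding the anchor itself).
            positives = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            # Candidates with any other label.
            negatives = [t for t, l in zip(texts, labels) if l != label]
            if positives:
                pairs.append((text, rng.choice(positives), 1.0))
            if negatives:
                pairs.append((text, rng.choice(negatives), 0.0))
    return pairs


# Hypothetical mini-dataset using example utterances from two of the labels.
texts = [
    "Track my order",
    "What is my shipment status",
    "Kindly cancel my order",
    "How can I cancel my pending order?",
]
labels = ["ORDER_STATUS", "ORDER_STATUS", "CANCEL_ORDER", "CANCEL_ORDER"]

pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
for anchor, other, target in pairs[:4]:
    print(f"{target:.1f}  {anchor!r} <-> {other!r}")
```

Each example yields one positive and one negative pair per iteration in this sketch, so a handful of labeled sentences expands into many training pairs.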

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 28

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
CALL_CENTER	'What time is your call centre operational during COVID?' 'is the call center still functioning during lockdown' 'what are the working hours of your call center during covid lockdown'
CANCEL_ORDER	"I'd like to cancel my pending order" 'How can I cancel my pending order?' 'Kindly cancel my order'
CHAT_WITH_AGENT	'Chat with agent' 'I need customer support' 'I want to chat with an agent'
CONSULT_START	'Tell me weight gaining' 'Consult Start' 'suggest me beginner diet'
DELAY_IN_PARCEL	'Is there a delay in delivery becuase of the pandemic?' 'How long is parcel delayed because of COVID?' 'Why is my delivery late'
EXPIRY_DATE	'What if I receive expired product' 'I have received an Expired product' 'Expiry Date'
FRANCHISE	'Get Franchise' 'would like to associated as seller' 'i want to enroll my self as a seller'
ORDER_STATUS	'Track my order' 'What is my shipment status' 'What is the progress of my orders'
INTERNATIONAL_SHIPPING	'Delivery out of India' 'International Shipping' 'Out of India'
MODES_OF_PAYMENTS	'Modes of payments' 'ways of paymets' 'Accepted modes of payments'
MODIFY_ADDRESS	'Change delivery address?' 'Delivery address is wrong it is to be changed' 'I want to change my delivery address'
ORDER_QUERY	'I have a query related to my order' 'Help required on order' 'details needed for my order'
ORDER_TAKING	'Are you taking orders during COVID?' 'i know its lockdown due to coronavirus but can i still place an order?' 'I wanted to order some things, can I place an order on the website?'
ORIGINAL_PRODUCT	'Original Products' 'do you have authentic products' 'Is your product original'
REFUNDS_RETURNS_REPLACEMENTS	'I want to know my refund status' 'I want to know about my replacements' 'I havent received my refund it has been many days since the return'
PAYMENT_AND_BILL	'I want to know about my payments' 'Payments and Bills' 'I have Payment & Bill Related Queries'
PORTAL_ISSUE	'Portal not working' 'Option is not visible' 'Unable to see product in my cart'
CHECK_PINCODE	'Product Service' 'pincode serviceable' 'I wanted to know whether you are delivering in'
RECOMMEND_PRODUCT	'Recommend a product' 'What are all the products you have?' "I am confused about what to buy since there are too many options I and I really don't know what I should focus on right now"
REFER_EARN	'Reedem referral' 'My friend refer me to CureKart' 'Refer Amount'
RESUME_DELIVERY	'When will you resume delivery due to COVID?' 'are you going to start delivery during this lockdown period as well?' 'other websites like lagoon are delivering when will curekart start again to deliver?'
SIDE_EFFECT	'It has any side effects or not' 'Does it have side effects?' 'is there any side effects'
SIGN_UP	'New to CureKart?' 'Where can I sign up' 'I am a new user'
START_OVER	'Show me the main menu' 'Start again' 'Start over'
STORE_INFORMATION	'Can I visit your store' 'Àre your shops operational' 'Are stores still opening?'
USER_GOAL_FORM	'Re-assess my profile' 'I would want to take re-assessment' 'Fill my goal'
WORK_FROM_HOME	'Is your head office working during lockdown?' 'is curekart office open during the lockdown?' 'I wanted to talk to contact your head office for some work but is it open?'
IMMUNITY	'How can I increase my immunity' 'I want to increase my immunity power' 'Increase immunity power'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("huiyeong/setfit-curekart-neg")
# Run inference
preds = model("+1 offer kya h")
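
When a single predicted label is not enough, the per-class probabilities can be ranked. The sketch below covers only that post-processing step. It assumes the probabilities were obtained separately (for example from `model.predict_proba`) and that their order matches a label list you supply; the helper `rank_labels`, the three-label subset, and the numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_labels(
    probs: Sequence[float],
    labels: Sequence[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (label, probability) pairs, highest probability first."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical probabilities for a three-label subset (assumed order matches example_labels).
example_labels = ["ORDER_STATUS", "CANCEL_ORDER", "REFER_EARN"]
example_probs = [0.15, 0.05, 0.80]

top = rank_labels(example_probs, example_labels, top_k=2)
print(top)
```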

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	6.1288	26

Label	Training Sample Count
CALL_CENTER	22
CANCEL_ORDER	15
CHAT_WITH_AGENT	46
CHECK_PINCODE	17
CONSULT_START	31
DELAY_IN_PARCEL	28
EXPIRY_DATE	10
FRANCHISE	13
IMMUNITY	7
INTERNATIONAL_SHIPPING	3
MODES_OF_PAYMENTS	9
MODIFY_ADDRESS	18
ORDER_QUERY	11
ORDER_STATUS	55
ORDER_TAKING	48
ORIGINAL_PRODUCT	28
PAYMENT_AND_BILL	29
PORTAL_ISSUE	4
RECOMMEND_PRODUCT	106
REFER_EARN	13
REFUNDS_RETURNS_REPLACEMENTS	63
RESUME_DELIVERY	56
SIDE_EFFECT	4
SIGN_UP	10
START_OVER	5
STORE_INFORMATION	18
USER_GOAL_FORM	16
WORK_FROM_HOME	14

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (5, 5)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 5
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0023	1	0.267	-
0.1144	50	0.2317	-
0.2288	100	0.1862	-
0.3432	150	0.1462	-
0.4577	200	0.1121	-
0.5721	250	0.0857	-
0.6865	300	0.0749	-
0.8009	350	0.0605	-
0.9153	400	0.0472	-
1.0297	450	0.0484	-
1.1442	500	0.0395	-
1.2586	550	0.0333	-
1.3730	600	0.0316	-
1.4874	650	0.0259	-
1.6018	700	0.0288	-
1.7162	750	0.0201	-
1.8307	800	0.0264	-
1.9451	850	0.0213	-
2.0595	900	0.0207	-
2.1739	950	0.0136	-
2.2883	1000	0.014	-
2.4027	1050	0.0126	-
2.5172	1100	0.0161	-
2.6316	1150	0.0105	-
2.7460	1200	0.01	-
2.8604	1250	0.0091	-
2.9748	1300	0.0107	-
3.0892	1350	0.0077	-
3.2037	1400	0.0073	-
3.3181	1450	0.0073	-
3.4325	1500	0.0067	-
3.5469	1550	0.0086	-
3.6613	1600	0.006	-
3.7757	1650	0.005	-
3.8902	1700	0.0046	-
4.0046	1750	0.0053	-
4.1190	1800	0.0043	-
4.2334	1850	0.0043	-
4.3478	1900	0.0058	-
4.4622	1950	0.0062	-
4.5767	2000	0.0041	-
4.6911	2050	0.0033	-
4.8055	2100	0.0045	-
4.9199	2150	0.0036	-
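
The Epoch column can be reproduced from the training sample counts and the hyperparameters listed above. The sketch below assumes that each training sample contributes `num_iterations` positive and `num_iterations` negative pairs to the contrastive fine-tuning stage; that assumption is not stated in this card, but it matches the logged epoch fractions.

```python
import math

# Training sample counts per label, in the order of the table above.
train_counts = [
    22, 15, 46, 17, 31, 28, 10, 13, 7, 3, 9, 18, 11, 55,
    48, 28, 29, 4, 106, 13, 63, 56, 4, 10, 5, 18, 16, 14,
]

num_samples = sum(train_counts)  # 699
num_iterations = 5               # from the hyperparameters
batch_size = 16                  # embedding-phase batch size
num_epochs = 5                   # embedding-phase epochs

# Assumption: num_iterations positive + num_iterations negative pairs per sample.
num_pairs = 2 * num_iterations * num_samples            # 6990
steps_per_epoch = math.ceil(num_pairs / batch_size)     # 437
total_steps = steps_per_epoch * num_epochs              # 2185


def epoch_at(step: int) -> float:
    """Fractional epoch for a given optimizer step, rounded like the table."""
    return round(step / steps_per_epoch, 4)


print(num_samples, num_pairs, steps_per_epoch, total_steps)
print(epoch_at(1), epoch_at(50), epoch_at(2150))  # 0.0023 0.1144 4.9199
```

Under this assumption the run has 2185 steps in total, so the last logged row (step 2150) falls shortly before the end of the fifth epoch.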

Framework Versions

Python: 3.11.13
SetFit: 1.1.2
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.6.0+cu124
Datasets: 3.6.0
Tokenizers: 0.21.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
