SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer (a conceptual sketch of this step is shown below).
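
To make step 2 concrete, here is a minimal sketch of the idea: the Sentence Transformer body turns texts into embeddings, and a LogisticRegression head is fit on those embeddings. The texts and labels are illustrative placeholders, and the base BAAI/bge-small-en-v1.5 checkpoint is used directly, without the contrastive fine-tuning of step 1.

from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Illustrative only: in SetFit this body would already be contrastively fine-tuned (step 1)
body = SentenceTransformer("BAAI/bge-small-en-v1.5")

texts = ["What is 8 times 9? It's 72.", "Who painted the Mona Lisa? Leonardo da Vinci painted it."]
labels = ["Math", "Art"]

# Step 2: embed the texts and fit a LogisticRegression head on the embeddings
embeddings = body.encode(texts)
head = LogisticRegression().fit(embeddings, labels)

print(head.predict(body.encode(["What is photosynthesis?"])))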

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-small-en-v1.5
  • Classification head: a LogisticRegression instance
  • Number of Classes: 7 classes

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Each label below is shown with three example texts:

English
  • "Can you tell me about your favorite book? I love 'Harry Potter' because it's full of magic and adventure."
  • "What did you learn about poems today? We learned about rhymes and how they create a rhythm in poems."
  • "Can you make a sentence using the word 'enigmatic'? The old man's smile was enigmatic, making me wonder what secrets he hid."
Math
  • "What is 8 times 9? It's 72."
  • "How do you find the area of a rectangle? Multiply the length by the width."
  • "What's the difference between a prime number and a composite number? A prime number has only two factors, 1 and itself, while a composite number has more than two factors."
Art
  • "What colors do you mix to make green? Yellow and blue make green."
  • "Who painted the Mona Lisa? Leonardo da Vinci painted it."
  • "What's the difference between sculpture and pottery? Sculpture is the art of making figures while pottery is specifically making vessels from clay."
Science
  • "What is photosynthesis? It's the process by which plants make their food using sunlight."
  • "Can you name the planets in our solar system? Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune."
  • "What's the difference between a solid and a liquid? A solid has a fixed shape while a liquid takes the shape of its container."
History
  • "Who was the first president of the United States? George Washington was the first president."
  • "Can you tell me about the Egyptian pyramids? They were massive tombs built for pharaohs, the biggest is the Pyramid of Giza."
  • "What was the Renaissance? It was a period of great cultural and scientific advancement in Europe."
Technology
  • "What is the Internet? It's a global network of computers that can share information."
  • "Can you name a famous computer scientist? Alan Turing is known as one of the fathers of computer science."
  • "What does 'AI' stand for? It stands for Artificial Intelligence."
NONE
  • "What did you have for lunch today? I had a sandwich and some fruit."
  • "Do you like playing outside? Yes, I love playing soccer with my friends."
  • "What's your favorite TV show? I love watching 'SpongeBob SquarePants'."

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("bew/setfit-subject-model-basic")
# Run inference
preds = model("Who was Cleopatra? She was a queen of ancient Egypt.")
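
Calling the model on a list of texts returns one predicted label per text. The probability call below assumes the model exposes predict_proba over its LogisticRegression head, as in recent SetFit releases; the example texts are illustrative.

# Batch inference: one predicted label per input text
texts = [
    "Who was Cleopatra? She was a queen of ancient Egypt.",
    "What is 8 times 9? It's 72.",
]
preds = model(texts)
print(preds)  # predicted labels, e.g. 'History' and 'Math'

# Class probabilities from the LogisticRegression head (one row per text)
probs = model.predict_proba(texts)
print(probs.shape)  # (2, number of classes)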

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     6     14.1333   30

Label        Training Sample Count
Art          10
English      10
History      10
Math         10
NONE         15
Science      10
Technology   10

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
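
These names and values map directly onto SetFit's TrainingArguments. As a rough reference, the sketch below shows how a comparable training run could be set up; the train_dataset here is a placeholder (a Hugging Face datasets.Dataset with "text" and "label" columns), since the actual training data is not part of this card.

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder training data; the real dataset behind this model is not included in the card
train_dataset = Dataset.from_dict({
    "text": ["What is 8 times 9? It's 72.", "Who painted the Mona Lisa? Leonardo da Vinci painted it."],
    "label": ["Math", "Art"],
})

model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")

args = TrainingArguments(
    batch_size=(32, 32),
    num_epochs=(10, 10),
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    end_to_end=False,
    use_amp=False,
    seed=42,
    # loss defaults to CosineSimilarityLoss, matching the value listed above
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
model.save_pretrained("setfit-subject-model-basic")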

Training Results

Epoch    Step   Training Loss   Validation Loss
0.0067   1      0.1987          -
0.3333   50     0.1814          -
0.6667   100    0.128           -
1.0      150    0.0146          -
1.3333   200    0.006           -
1.6667   250    0.0037          -
2.0      300    0.0031          -
2.3333   350    0.0027          -
2.6667   400    0.0024          -
3.0      450    0.0024          -
3.3333   500    0.002           -
3.6667   550    0.002           -
4.0      600    0.0017          -
4.3333   650    0.0019          -
4.6667   700    0.0018          -
5.0      750    0.0014          -
5.3333   800    0.0013          -
5.6667   850    0.0014          -
6.0      900    0.0014          -
6.3333   950    0.0014          -
6.6667   1000   0.0016          -
7.0      1050   0.0013          -
7.3333   1100   0.0013          -
7.6667   1150   0.0012          -
8.0      1200   0.0014          -
8.3333   1250   0.001           -
8.6667   1300   0.0012          -
9.0      1350   0.0014          -
9.3333   1400   0.0012          -
9.6667   1450   0.0012          -
10.0     1500   0.0011          -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.3.1
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.17.0
  • Tokenizers: 0.15.2
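
To approximate this environment, the versions listed above can be pinned directly; the CUDA-specific PyTorch build (2.1.0+cu121) typically comes from the PyTorch wheel index rather than a plain pip pin.

pip install setfit==1.0.3 sentence-transformers==2.3.1 transformers==4.35.2 datasets==2.17.0 tokenizers==0.15.2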

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}