
tweet_instruct_detect

This model is a fine-tuned version of microsoft/Multilingual-MiniLM-L12-H384 on a dataset combining manually labelled tweets (classed as either instructions or spam) with pre-processed instructions from the FLAN dataset; FLAN instructions shorter than 250 characters were used as positive instruction examples. It achieves the following results on the evaluation set at the best checkpoint:

  • Loss: 0.1102
  • Accuracy: 0.9751

Model description

This model is trained to help determine if tweets are useful instructions. This can be used to filter the large corpus of tweet data online into useful instruction datasets for instruction fine-tuning.
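
Filtering a tweet corpus with this model can be sketched with the standard `transformers` pipeline API (the exact label names returned depend on the model's `id2label` config, so treat the labels in the comment as assumptions):

```python
from transformers import pipeline

def classify_tweets(texts, model_id="jmete/tweet_instruct_detect"):
    """Score tweets with the fine-tuned classifier.

    Returns a list of dicts like {"label": ..., "score": ...};
    the label strings (e.g. instruction vs. spam) come from the
    model config and should be checked rather than assumed.
    """
    clf = pipeline("text-classification", model=model_id)
    return clf(texts)
```

For example, `classify_tweets(["How do I sort a list in Python?"])` would return one label/score dict for the input tweet.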

Intended uses & limitations

Intended to be used to determine if tweets are useful instructions.

The model will be biased towards English data, and may be biased towards certain ways of phrasing "instructions". Instructions in this case may also be questions.

The current version of the model is quite basic and can be confused by simple changes. For example, simply appending a "?" character biases it heavily towards the instruction class, even for an otherwise identical sentence, so it is highly sensitive to specific characters and phrasings. This can hopefully be improved with better training data or model tuning.

Update: The latest version should be less sensitive to "?" characters. I randomly modified the training data to include them. This reduces overall performance slightly, but should make the model less sensitive to specific characters and generalize better to how people talk on Twitter.
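
The augmentation described above might look like the following sketch (the 50% probability, the seeding, and the strip-or-append rule are assumptions, not the exact procedure used):

```python
import random

def augment_question_marks(texts, prob=0.5, seed=42):
    """Randomly toggle a trailing '?' so the classifier cannot rely on it."""
    rng = random.Random(seed)
    out = []
    for text in texts:
        if rng.random() < prob:
            # Either strip an existing trailing '?' or append one.
            text = text.rstrip("?") if text.endswith("?") else text + "?"
        out.append(text)
    return out
```

Applied to the training split before fine-tuning, this decouples the "?" character from the instruction label.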

Training and evaluation data

Model was fine-tuned on a relatively small number of tweets and instructions.

  • Train data: 837 examples
  • Test data: 281 examples

Of the total examples, 600 were manually labelled tweets, most of which were spam due to the high noise ratio in tweets. Spam here can refer to actual spam, gibberish, or statements that are generally fine but not useful as an instruction or question.
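
The FLAN pre-processing described earlier amounts to a simple length filter; a minimal sketch (the function name and the 1 = instruction / 0 = spam label mapping are assumptions):

```python
def select_short_instructions(examples, max_chars=250):
    """Keep FLAN instructions under the length cap and label them positive."""
    return [
        {"text": text, "label": 1}  # 1 = instruction, 0 = spam (assumed mapping)
        for text in examples
        if len(text) < max_chars
    ]
```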

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
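
Collected in one place, the hyperparameters above map onto the usual Hugging Face `TrainingArguments` field names (the key names below follow that convention and are an assumption; the values are taken directly from the list):

```python
# Hyperparameters from the training run above, keyed by TrainingArguments names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}
```

Such a dict can be unpacked into `TrainingArguments(**HYPERPARAMS, ...)` when reproducing the run.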

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 53   | 0.3291          | 0.9537   |
| No log        | 2.0   | 106  | 0.1896          | 0.9537   |
| No log        | 3.0   | 159  | 0.1724          | 0.9573   |
| No log        | 4.0   | 212  | 0.1102          | 0.9751   |
| No log        | 5.0   | 265  | 0.1450          | 0.9644   |
| No log        | 6.0   | 318  | 0.1223          | 0.9715   |
| No log        | 7.0   | 371  | 0.1434          | 0.9680   |
| No log        | 8.0   | 424  | 0.1400          | 0.9680   |
| No log        | 9.0   | 477  | 0.1349          | 0.9715   |
| 0.1523        | 10.0  | 530  | 0.1370          | 0.9715   |
| 0.1523        | 11.0  | 583  | 0.1376          | 0.9715   |
| 0.1523        | 12.0  | 636  | 0.1385          | 0.9715   |
| 0.1523        | 13.0  | 689  | 0.1392          | 0.9715   |
| 0.1523        | 14.0  | 742  | 0.1399          | 0.9715   |
| 0.1523        | 15.0  | 795  | 0.1395          | 0.9715   |
| 0.1523        | 16.0  | 848  | 0.1402          | 0.9715   |
| 0.1523        | 17.0  | 901  | 0.1462          | 0.9680   |
| 0.1523        | 18.0  | 954  | 0.1533          | 0.9680   |
| 0.0492        | 19.0  | 1007 | 0.1472          | 0.9680   |
| 0.0492        | 20.0  | 1060 | 0.1452          | 0.9680   |

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1
  • Datasets 2.9.0
  • Tokenizers 0.13.2