Hyperparameters:

  • learning rate: 2e-5
  • weight decay: 0.01
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 1
  • eval steps: 6000
  • max_length: 512
  • num_epochs: 2
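
The settings above map roughly onto keyword arguments of the Hugging Face `transformers.TrainingArguments` class. The following is a minimal sketch under that assumption. The card does not include the training script, so the key names `num_train_epochs` and `eval_steps`, the helper `effective_batch_size`, and the device count are illustrative, and `max_length` is assumed to be the tokenizer truncation length rather than a trainer argument.

```python
# Hyperparameters from the list above, keyed by the argument names that
# transformers.TrainingArguments uses (an assumption: the original
# training script is not part of this card).
training_kwargs = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_steps": 6000,
    "num_train_epochs": 2,
}

# Assumed to be the tokenizer truncation length, not a trainer argument.
MAX_LENGTH = 512


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step (num_devices is hypothetical)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(training_kwargs))
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder. For `eval_steps` to take effect, step-based evaluation would also need to be enabled.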

Dataset version:

  • “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”
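
The three strings above look like a Hub dataset repository, a configuration name, and an abbreviated commit hash. Assuming that reading (the card does not label them), a loading call with the `datasets` library might look like the sketch below. `DATASET_SPEC` and `load_tasky_dataset` are illustrative names, not part of the original project.

```python
# Assumed interpretation of the three values in the card:
# repository id, configuration name, and git revision.
DATASET_SPEC = {
    "path": "craffel/tasky_or_not",
    "name": "10xp3_10xc4",
    "revision": "15f88c8",
}


def load_tasky_dataset(spec=DATASET_SPEC):
    """Download the pinned dataset (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset(spec["path"], spec["name"], revision=spec["revision"])
```

Whether the Hub resolves the abbreviated hash may depend on the library version; if it does not, the full commit hash would be needed.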

Checkpoint:

  • 48000 steps

Results on Validation set:

Step     Training Loss  Validation Loss  Accuracy  Precision  Recall    F1
6000     0.031900       0.163412         0.982194  0.999211   0.980462  0.989748
12000    0.014700       0.106132         0.976666  0.999639   0.973733  0.986516
18000    0.010700       0.043012         0.995743  0.999223   0.995918  0.997568
24000    0.007400       0.095047         0.984724  0.999857   0.982714  0.991211
30000    0.004100       0.087274         0.990400  0.999829   0.989217  0.994495
36000    0.003100       0.162909         0.981972  1.000000   0.979434  0.989610
42000    0.002200       0.148721         0.980454  0.999986   0.977717  0.988726
48000    0.001000       0.094455         0.990437  0.999943   0.989147  0.994516
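
F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R), so the last column can be recomputed from the two before it. The sketch below does this for the validation rows and then picks the evaluation step with the lowest validation loss. The names `ROWS`, `f1_score`, and `best_step` are illustrative; the numbers are copied from the results above.

```python
# (step, validation_loss, precision, recall, reported_f1) from the results.
ROWS = [
    (6000, 0.163412, 0.999211, 0.980462, 0.989748),
    (12000, 0.106132, 0.999639, 0.973733, 0.986516),
    (18000, 0.043012, 0.999223, 0.995918, 0.997568),
    (24000, 0.095047, 0.999857, 0.982714, 0.991211),
    (30000, 0.087274, 0.999829, 0.989217, 0.994495),
    (36000, 0.162909, 1.000000, 0.979434, 0.989610),
    (42000, 0.148721, 0.999986, 0.977717, 0.988726),
    (48000, 0.094455, 0.999943, 0.989147, 0.994516),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Each reported F1 should match the recomputed value up to rounding.
for step, _, p, r, reported in ROWS:
    assert abs(f1_score(p, r) - reported) < 1e-5, step

# Evaluation step with the lowest validation loss.
best_step = min(ROWS, key=lambda row: row[1])[0]
print(best_step)
```

On these numbers the lowest validation loss (0.043012) occurs at step 18000, which is also the row with the highest F1.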

Dataset used to train taskydata/deberta-v3-base_10xp3_10xc4_512