pszemraj committed
Commit a6ad7c9 · verified · 1 Parent(s): c6cfc9d

Update README.md

  1. README.md +24 -39
README.md CHANGED
@@ -4,19 +4,17 @@ language:
  - en
  license: apache-2.0
  base_model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
- tags:
- - generated_from_trainer
  metrics:
  - rouge
- model-index:
- - name: tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
-   results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
 
  This model is a fine-tuned version of [pszemraj/tFINE-850m-24x24-v0.4-flan_aug](https://huggingface.co/pszemraj/tFINE-850m-24x24-v0.4-flan_aug) on the pszemraj/infinity-instruct-7m-T2T_en dataset.
  It achieves the following results on the evaluation set:
@@ -28,6 +26,26 @@ It achieves the following results on the evaluation set:
  - Gen Len: 441.475
  - Num Input Tokens Seen: 435513684
 
  ## Quick eval
 
  Quick eval for: `pszemraj/tFINE-850m-24x24-v0.5-instruct-L1`
@@ -49,36 +67,3 @@ hf (pretrained=pszemraj/tFINE-850m-24x24-v0.5-instruct-L1,trust_remote_code=True
  |tinyMMLU | 0|none | 0|acc_norm |↑ |0.3021|± | N/A|
  |winogrande | 1|none | 0|acc |↑ |0.4925|± |0.0141|
 
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 3e-05
- - train_batch_size: 4
- - eval_batch_size: 4
- - seed: 776444
- - gradient_accumulation_steps: 32
- - total_train_batch_size: 128
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: constant_with_warmup
- - lr_scheduler_warmup_ratio: 0.05
- - num_epochs: 1.0
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Input Tokens Seen |
- |:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-----------------:|
- | 1.8808 | 0.0807 | 1000 | 1.7883 | 24.1946 | 12.2099 | 20.4185 | 22.251 | 636.465 | 35147692 |
- | 1.6545 | 0.1613 | 2000 | 1.5985 | 28.9492 | 15.3233 | 23.871 | 26.9919 | 577.04 | 70510224 |
- | 1.5522 | 0.2420 | 3000 | 1.4907 | 30.4033 | 16.1354 | 24.7244 | 28.5037 | 537.77 | 105707144 |
- | 1.5059 | 0.3227 | 4000 | 1.4204 | 34.0294 | 19.2608 | 27.9322 | 32.3166 | 522.495 | 140722844 |
- | 1.4346 | 0.4034 | 5000 | 1.3636 | 34.4104 | 19.4149 | 28.1022 | 32.7299 | 494.68 | 175639924 |
- | 1.3912 | 0.4840 | 6000 | 1.3159 | 36.5059 | 21.2447 | 30.116 | 34.7303 | 469.885 | 210409328 |
- | 1.3148 | 0.5647 | 7000 | 1.2807 | 37.0123 | 21.3666 | 30.11 | 35.0891 | 458.28 | 245601908 |
- | 1.2859 | 0.6454 | 8000 | 1.2492 | 37.05 | 21.0468 | 29.7988 | 35.1882 | 452.495 | 280866724 |
- | 1.298 | 0.7260 | 9000 | 1.2211 | 36.6966 | 20.8189 | 29.7115 | 34.7528 | 464.37 | 316042068 |
- | 1.2834 | 0.8067 | 10000 | 1.1979 | 37.7181 | 20.9926 | 30.3857 | 35.8681 | 446.26 | 351056548 |
- | 1.2577 | 0.8874 | 11000 | 1.1752 | 39.3539 | 23.0123 | 31.9005 | 37.4941 | 424.445 | 386471860 |
- | 1.193 | 0.9680 | 12000 | 1.1526 | 40.1804 | 23.1008 | 32.3484 | 38.2103 | 422.225 | 421585440 |
-
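Note on the removed section: the hyperparameters and the last logged table row can be cross-checked with simple arithmetic. The sketch below is not part of the commit. It assumes training ran on a single device (the card does not state a device count), and all variable names are illustrative.

```python
# Values copied from the removed "Training hyperparameters" list.
train_batch_size = 4
gradient_accumulation_steps = 32
reported_total_batch_size = 128

# Last logged row of the removed "Training results" table.
last_step = 12000
last_epoch = 0.9680
last_tokens_seen = 421585440

# "Num Input Tokens Seen" from the evaluation summary kept in the README.
reported_final_tokens = 435513684

# Assumption: one device, so no extra multiplier for data parallelism.
num_devices = 1
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Rough estimates derived from the last logged row.
steps_per_epoch = last_step / last_epoch
tokens_per_step = last_tokens_seen / last_step
tokens_per_example = tokens_per_step / effective_batch_size
projected_final_tokens = last_tokens_seen / last_epoch

print(f"effective batch size: {effective_batch_size}")
print(f"estimated optimizer steps per epoch: {steps_per_epoch:.0f}")
print(f"average input tokens per example: {tokens_per_example:.1f}")
print(f"projected tokens at epoch 1.0: {projected_final_tokens:.0f}")
```

Under the single-device assumption, the product of batch size and accumulation steps matches the reported `total_train_batch_size`, and extrapolating the last row to a full epoch lands within a fraction of a percent of the reported token count.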
 
  - en
  license: apache-2.0
  base_model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
  metrics:
  - rouge
+ datasets:
+ - pszemraj/infinity-instruct-7m-T2T_en
+ pipeline_tag: text2text-generation
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
+ # tFINE-850m-24x24-v0.5-instruct-L1
 
  This model is a fine-tuned version of [pszemraj/tFINE-850m-24x24-v0.4-flan_aug](https://huggingface.co/pszemraj/tFINE-850m-24x24-v0.4-flan_aug) on the pszemraj/infinity-instruct-7m-T2T_en dataset.
  It achieves the following results on the evaluation set:
 
  - Gen Len: 441.475
  - Num Input Tokens Seen: 435513684
 
+ ## usage
+
+ ```py
+ from transformers import pipeline
+
+ pipe = pipeline(
+     "text2text-generation", model="pszemraj/tFINE-850m-24x24-v0.5-instruct-L1"
+ )
+ prompt = "write a python script to download a file from a url and save as a local file using requests. explain how it works"
+ res = pipe(
+     prompt,
+     max_new_tokens=192,
+     top_k=4,
+     penalty_alpha=0.6,
+     renormalize_logits=True,
+     no_repeat_ngram_size=5,
+ )
+ print(res[0]["generated_text"])
+ ```
+
  ## Quick eval
 
  Quick eval for: `pszemraj/tFINE-850m-24x24-v0.5-instruct-L1`
 
  |tinyMMLU | 0|none | 0|acc_norm |↑ |0.3021|± | N/A|
  |winogrande | 1|none | 0|acc |↑ |0.4925|± |0.0141|
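Note on reading the quick-eval rows: the value after `±` is the standard error of the metric. As a rough guide, the sketch below turns the winogrande row into an approximate 95% interval and compares it with the 0.5 chance level of a two-option task. It is not part of the commit. The normal approximation and the helper name `normal_interval` are assumptions made for illustration, and the tinyMMLU row is skipped because its standard error is reported as N/A.

```python
def normal_interval(value, stderr, z=1.96):
    """Approximate 95% interval under a normal approximation (an assumption)."""
    half_width = z * stderr
    return value - half_width, value + half_width


# Numbers copied from the winogrande row of the quick-eval table.
winogrande_acc = 0.4925
winogrande_stderr = 0.0141

# winogrande is a two-option task, so random guessing scores 0.5.
chance_level = 0.5

low, high = normal_interval(winogrande_acc, winogrande_stderr)
overlaps_chance = low <= chance_level <= high

print(f"winogrande acc interval: [{low:.4f}, {high:.4f}]")
print(f"interval contains chance level: {overlaps_chance}")
```

Read this way, the reported winogrande accuracy is not distinguishable from random guessing at this sample size.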