Update README.md
README.md
CHANGED
@@ -82,15 +82,6 @@ Hyperparameters:
 "gradient_accumulation_steps" : 4
 }
 ```
-Model was trained on 1xA100 80GB, below loss and memory consmuption details:
-{'eval_loss': 0.9614351987838745, 'eval_runtime': 244.0411, 'eval_samples_per_second': 2.663, 'eval_steps_per_second': 0.668, 'epoch': 3.0}
-{'train_runtime': 19718.5285, 'train_samples_per_second': 0.781, 'train_steps_per_second': 0.049, 'train_loss': 0.8241131883172602, 'epoch': 3.0}
-Total training time 19720.924563884735
-328.64 minutes used for training.
-Peak reserved memory = 35.789 GB.
-Peak reserved memory for training = 27.848 GB.
-Peak reserved memory % of max memory = 45.216 %.
-Peak reserved memory for training % of max memory = 35.183 %.
 
 
 ## Evaluation