This model is a fine-tuned version of meta-llama/CodeLlama-34b-Instruct-hf on the meng-lab/CodeLlama-34B-Instruct-gsm8k dataset. It achieves the following results on the evaluation set (final evaluation, step 600):

- Loss: 4.0230
- Loss Layer 6 Head: 1.2898
- Loss Layer 12 Head: 1.0049
- Loss Layer 18 Head: 0.9093
- Loss Layer 24 Head: 0.4408
- Loss Layer 30 Head: 0.2683
- Loss Layer 36 Head: 0.1391
- Loss Layer 42 Head: 0.0639
## Model description

More information needed
## Intended uses & limitations

More information needed
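Since the card does not yet document usage, below is a minimal sketch of how one might load the checkpoint and run it on a GSM8K-style word problem with the `transformers` library. The Hub repository ID (`meng-lab/CodeLlama-34B-Instruct-gsm8k`) is assumed from the dataset name on this card, and the `[INST] … [/INST]` wrapper is CodeLlama-Instruct's standard prompt format, not something this card confirms.

```python
# Minimal usage sketch. Assumptions: the fine-tuned weights are published
# under the same ID as the dataset, and CodeLlama's instruction format applies.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meng-lab/CodeLlama-34B-Instruct-gsm8k"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 34B parameters: half precision to fit in memory
    device_map="auto",          # shard across available GPUs
)

# A GSM8K-style word problem wrapped in CodeLlama's instruction format.
prompt = (
    "[INST] Natalia sold clips to 48 of her friends in April, and then she "
    "sold half as many clips in May. How many clips did Natalia sell "
    "altogether in April and May? [/INST]"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```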
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
### Training results

| Training Loss | Epoch | Step | Validation Loss | Loss Layer 6 Head | Loss Layer 12 Head | Loss Layer 18 Head | Loss Layer 24 Head | Loss Layer 30 Head | Loss Layer 36 Head | Loss Layer 42 Head |
|---|---|---|---|---|---|---|---|---|---|---|
| 2.6241 | 25.8065 | 200 | 4.3768 | 1.3707 | 1.0927 | 0.9492 | 0.4907 | 0.2888 | 0.1534 | 0.0899 |
| 1.6189 | 51.6129 | 400 | 4.0476 | 1.3067 | 0.9916 | 0.9104 | 0.4445 | 0.2716 | 0.1405 | 0.0663 |
| 1.3737 | 77.4194 | 600 | 4.0230 | 1.2898 | 1.0049 | 0.9093 | 0.4408 | 0.2683 | 0.1391 | 0.0639 |
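The per-layer-head loss columns above indicate auxiliary LM heads trained at intermediate decoder layers (6 through 42), consistent with AdaDecode-style early exiting. As a purely illustrative sketch, continuing from the loading snippet above, here is how one could read an intermediate layer's hidden state and score a next token from it; reusing the final `norm` and `lm_head` is a stand-in, since this card does not describe how the auxiliary heads are stored in the checkpoint.

```python
# Illustrative early-exit sketch, continuing from the loading snippet above.
# Assumption: layer-6 predictions come from some LM head applied to that
# layer's hidden state; the final norm + lm_head below are stand-ins, since
# the card does not say how the auxiliary heads are packaged.
import torch

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output; hidden_states[i] is the output
# of decoder layer i, so index 6 corresponds to the "Layer 6 Head" column.
h6 = out.hidden_states[6]
early_logits = model.lm_head(model.model.norm(h6))  # stand-in for the layer-6 head
next_token = early_logits[:, -1, :].argmax(dim=-1)
print(tokenizer.decode(next_token))
```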