Model Details
meta-llama/Meta-Llama-3-8B fine-tuned on 100,000 CLRS-Text examples.
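A minimal sketch of loading the model for inference with 🤗 Transformers; the prompt shown is a hypothetical CLRS-Text-style serialisation, not a confirmed format from this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "smcleish/clrs_llama_3_8b_100k_finetune_with_traces"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Hypothetical prompt; the exact CLRS-Text serialisation may differ.
prompt = "insertion_sort:\nkey: [5 2 4 6 1 3]\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens (the model's answer and trace).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```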
Training Details
- Learning rate: 1e-4 with 150 warmup steps, then cosine-decayed to 5e-6, using the AdamW optimiser
- Batch size: 128
- Loss computed over the answer tokens only, not the question (see the training sketch below)
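A minimal PyTorch sketch of this setup, assuming a standard causal-LM training loop (helper names are hypothetical): linear warmup for 150 steps, cosine decay from 1e-4 to a 5e-6 floor, and question tokens masked with -100 so the cross-entropy loss is taken over the answer only:

```python
import math
import torch

def build_optimizer_and_scheduler(model, total_steps, warmup_steps=150,
                                  peak_lr=1e-4, min_lr=5e-6):
    # AdamW at the peak learning rate; the scheduler scales it each step.
    optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr)

    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)  # linear warmup to peak_lr
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        # Cosine decay from peak_lr down to the min_lr floor.
        return (min_lr + (peak_lr - min_lr) * cosine) / peak_lr

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler

def mask_question_tokens(input_ids, question_len):
    # Labels set to -100 are ignored by the causal-LM cross-entropy loss,
    # so only answer tokens contribute to the gradient.
    labels = input_ids.clone()
    labels[:, :question_len] = -100
    return labels
```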
Model: smcleish/clrs_llama_3_8b_100k_finetune_with_traces
Base model: meta-llama/Meta-Llama-3-8B