masani/SFT_gsm8k_train_size_256_Llama-3.2-1B_epoch_4_global_step_4 Text Generation • 1B • Updated May 13 • 21
masani/SFT_gsm8k_train_size_1024_Llama-3.2-1B_epoch_2_global_step_8 Text Generation • 1B • Updated May 13 • 27
masani/SFT_gsm8k_train_size_512_Llama-3.2-1B_epoch_3_global_step_6 Text Generation • 1B • Updated May 13 • 21
masani/SFT_gsm8k_train_size_256_Llama-3.2-1B_epoch_5_global_step_5 Text Generation • 1B • Updated May 13 • 3
masani/SFT_gsm8k_train_size_4096_Llama-3.2-1B_epoch_1_global_step_16 Text Generation • 1B • Updated May 13 • 22
masani/SFT_gsm8k_train_size_1024_Llama-3.2-1B_epoch_1_global_step_4 Text Generation • 1B • Updated May 13 • 3
masani/SFT_gsm8k_train_size_2048_Llama-3.2-1B_epoch_1_global_step_8 Text Generation • 1B • Updated May 13 • 23
masani/SFT_gsm8k_train_size_512_Llama-3.2-1B_epoch_1_global_step_2 Text Generation • 1B • Updated May 13 • 3
masani/SFT_gsm8k_train_size_256_Llama-3.2-1B_epoch_1_global_step_1 Text Generation • 1B • Updated May 13 • 3