gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-1.0-iteration2 Text Generation • Updated 17 days ago • 16
gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-20.0-iteration1 Text Generation • Updated 18 days ago • 54
gupta-tanish/llama-3-8b-instruct-refa-lr-1e-6-beta10-gamma4-lambda-1.0-eos-increase-iteration2-lamda-0.1 Text Generation • Updated 19 days ago • 59
gupta-tanish/llama-3-8b-instruct-refa-lr-1e-6-beta10-gamma4-lambda-0.1-eos-increase-iteration2 Text Generation • Updated 19 days ago • 43
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.001-lr-1e-6-iteration1 Text Generation • Updated 20 days ago • 89
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.01-lr-1e-6-iteration1 Text Generation • Updated 20 days ago • 46
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.1-lr-1e-6-iteration1 Text Generation • Updated 20 days ago • 44
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-1.0-lr-1e-6-iteration1 Text Generation • Updated 20 days ago • 60
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked-new Viewer • Updated about 2 hours ago • 12.9k
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked-old Viewer • Updated about 2 hours ago • 31.1k
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked Viewer • Updated about 17 hours ago • 33.5k
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation Viewer • Updated about 23 hours ago • 41.5k
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data-iteration2 Viewer • Updated 3 days ago • 200
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data-Step-MPO Viewer • Updated 3 days ago • 8.8k • 9
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data Viewer • Updated 6 days ago • 9.05k • 51
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-logps Viewer • Updated 6 days ago • 4.32k • 39
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-logps Viewer • Updated 6 days ago • 41.5k • 41
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-on-policy-alignment-pertubation-generation-logps Viewer • Updated 6 days ago • 38 • 46