gupta-tanish/llama3-8b-instruct-on-policy-mpo-iteration2-v3 Text Generation • Updated 13 days ago • 33
gupta-tanish/llama3-8b-instruct-on-policy-mpo-iteration1-v3 Text Generation • Updated 15 days ago • 47
gupta-tanish/llama3-8b-instruct-on-policy-mpo-iteration1-v2 Text Generation • Updated 17 days ago • 41
gupta-tanish/mistral-instruct-v0.2-on-policy-mpo-iteration2 Text Generation • Updated 21 days ago • 37
gupta-tanish/mistral-instruct-v0.2-on-policy-mpo-iteration1 Text Generation • Updated 22 days ago • 52
gupta-tanish/llama3-8b-instruct-on-policy-swepo-ultrainteract Viewer • Updated 9 days ago • 65.7k • 137
gupta-tanish/mistral-7b-instruct-on-policy-swepo-ultrainteract Viewer • Updated 9 days ago • 65.7k • 13
gupta-tanish/llama3-8b-instruct-on-policy-swepo-iteration3-v3 Viewer • Updated 12 days ago • 40k • 101
gupta-tanish/llama3-8b-instruct-on-policy-swepo-iteration2-v3 Viewer • Updated 14 days ago • 39.9k • 81
gupta-tanish/llama3-8b-instruct-on-policy-swepo-iteration2-v2 Viewer • Updated 17 days ago • 39.9k • 130
gupta-tanish/mistral-instruct-v0.2-on-policy-swepo-iteration3 Viewer • Updated 21 days ago • 40k • 66
gupta-tanish/mistral-instruct-v0.2-on-policy-swepo-iteration2 Viewer • Updated 21 days ago • 39.9k • 38