Active filters: dpo
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e • Text Generation • 7B • Updated • 2
KoNqUeRoR3891/HW2-dpo • Text Generation • 0.1B • Updated • 4
nomadrp/tq-aya101-gt2
nomadrp/tq-llama3.1-gt3 • Updated • 163
NicholasCorrado/zephyr-7b-uf-rlced-conifer-1e2e-group-dpo-2e • Text Generation • 7B • Updated • 2
nomadrp/tq-llama3.1-sent-shlfd-gt3
QuantFactory/Lama-DPOlphin-8B-GGUF • Text Generation • 8B • Updated • 77 • 2
LBK95/Llama-2-7b-hf-DPO-LookAhead5_FullEval_TTree1.4_TLoop0.7_TEval0.2_V1.0
Wenboz/zephyr-7b-wpo-lora
YYYYYYibo/gshf_ours_1_iter_2 • 7B • Updated • 2
Magpie-Align/MagpieLM-4B-Chat-v0.1 • Text Generation • 5B • Updated • 53 • 20
Triangle104/NeuralDaredevil-8B-abliterated-Q4_K_M-GGUF • 8B • Updated • 3
Triangle104/NeuralDaredevil-8B-abliterated-Q4_0-GGUF • 8B • Updated • 5
Triangle104/NeuralDaredevil-8B-abliterated-Q4_K_S-GGUF • 8B • Updated • 9
YYYYYYibo/gshf_ours_1_iter_3 • 7B • Updated • 2
lewtun/dpo-model-lora
CharlesLi/OpenELM-1_1B-DPO-full-max-min-reward • Text Generation • 1B • Updated • 2
CharlesLi/OpenELM-1_1B-DPO-full-max-random-reward • Text Generation • 1B • Updated • 2
CharlesLi/OpenELM-1_1B-DPO-full-least-similar • Text Generation • 1B • Updated • 2
taicheng/zephyr-7b-dpo-qlora
CharlesLi/OpenELM-1_1B-DPO-full-max-reward-least-similar • Text Generation • 1B • Updated • 7
dmariko/SmolLM-360M-Instruct-dpo-15k • 0.4B • Updated • 2
QinLiuNLP/llama3-sudo-dpo-instruct-5epochs-0909
CharlesLi/OpenELM-1_1B-DPO-full-max-reward-most-similar • Text Generation • 1B • Updated • 2
CharlesLi/OpenELM-1_1B-DPO-full-most-similar • Text Generation • 1B • Updated • 2
DUAL-GPO/phi-2-dpo-chatml-lora-i1
CharlesLi/OpenELM-1_1B-DPO-full-max-second-reward • Text Generation • 1B • Updated • 3
CharlesLi/OpenELM-1_1B-DPO-full-random-pair • Text Generation • 1B • Updated • 2
Wenboz/zephyr-7b-dpo-lora
DUAL-GPO/phi-2-dpo-chatml-lora-10k-30k-i1
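
The listing above is the Hub's model index filtered on the dpo tag. A minimal sketch of reproducing such a query programmatically with the huggingface_hub client is shown below; the sort key and result limit are illustrative assumptions, not part of the page.

```python
# Minimal sketch: query the Hugging Face Hub for models tagged "dpo",
# mirroring the "Active filters: dpo" listing above.
# sort="downloads" and limit=20 are assumptions chosen for illustration.
from huggingface_hub import list_models

for model in list_models(filter="dpo", sort="downloads", limit=20):
    # Each ModelInfo carries the fields shown on a model card row:
    # repo id, pipeline tag, download count, and like count.
    print(model.id, model.pipeline_tag, model.downloads, model.likes)
```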