Turkish-rewardbench / model_performance_categorical.csv
Model,Average,Chat,Chat Hard,Safety,Reasoning
OpenAssistant/reward-model-deberta-v3-large-v2,0.555,0.339,0.608,0.526,0.746
Skywork/Skywork-Reward-Llama-3.1-8B-v0.2,0.85,0.868,0.782,0.864,0.885
allenai/tulu-2-dpo-7b,0.489,0.438,0.505,0.522,0.492
openbmb/UltraRM-13b,0.64,0.79,0.584,0.47,0.716
openbmb/Eurus-RM-7b,0.706,0.932,0.495,0.616,0.782