mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 11
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_6400_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 11
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_3200_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 10
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_8000_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 7
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1600_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 10
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_4000_rope-scal_yarn • Text Generation • 8B • Updated 6 days ago • 8
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization • Paper • 2404.00530 • Published Mar 31, 2024
ModelCitizens: Representing Community Voices in Online Safety • Paper • 2507.05455 • Published Jul 7 • 4
When Do Neural Nets Outperform Boosted Trees on Tabular Data? • Paper • 2305.02997 • Published May 4, 2023
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets • Paper • 2506.04598 • Published Jun 5 • 6
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models • Paper • 2406.02061 • Published Jun 4, 2024 • 2
DataComp-LM: In search of the next generation of training sets for language models • Paper • 2406.11794 • Published Jun 17, 2024 • 54
LiveBench: A Challenging, Contamination-Free LLM Benchmark • Paper • 2406.19314 • Published Jun 27, 2024 • 23
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks • Paper • 2402.11137 • Published Feb 17, 2024