Article: Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?
hbXNov/videophy_autoeval_three_models_rule_e3_lr5e-4_bs64_part2_vta_pc_rule_ckpt502 Updated 29 days ago
hbXNov/llama_8b_instruct_distill_r1_q1p5b_balanced_train_e3_lr5e-7_all-ckpt_3278 Updated about 1 month ago • 63
hbXNov/llama_8b_instruct_distill_r1_q1p5b_balanced_train_e6_lr5e-7_balanced_ckpt-4386 Updated Mar 2 • 30
hbXNov/qwen_1p5b_base_distill_r1_q1p5b_balanced_train_e3_lr1e-5_balanced_ckpt_2193 Updated Mar 2 • 11
hbXNov/llama_8b_instruct_distill_r1_q1p5b_balanced_train_e3_lr5e-7_balanced_ckpt_2193 Updated Mar 1 • 12
hbXNov/llama_8b_instruct_distill_r1_q1p5b_balanced_train_e3_lr1e-5_balanced_ckpt_2193 Updated Mar 1 • 8
hbXNov/distill_r1_qwen_1p5B_gpt_4o_verify_remove_think_processed Viewer • Updated Feb 27 • 8.02k • 66
hbXNov/distill_r1_qwen_2.5_1.5b_32k_soln_gpt_4o_verify_remove_think Viewer • Updated Feb 26 • 7.38k • 49