Article 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • Jan 2 • 41
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 13 days ago • 147
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 36
Recent highlights Collection Some recent models worth checking out • 18 items • Updated Nov 1, 2024 • 53
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 1 day ago • 209
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16, 2024 • 39