Large Model Ideas - an anbinx Collection

Large Model Ideas

updated 1 day ago

Instruction Following without Instruction Tuning
Paper • 2409.14254 • Published Sep 21, 2024 • 31

Baichuan Alignment Technical Report
Paper • 2410.14940 • Published Oct 19, 2024 • 52

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
Paper • 2410.16256 • Published Oct 21, 2024 • 61

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Paper • 2410.18558 • Published Oct 24, 2024 • 20

Self-Consistency Preference Optimization
Paper • 2411.04109 • Published Nov 6, 2024 • 19

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published Jan 22 • 405

Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published Feb 5 • 59

Qwen2.5-VL Technical Report
Paper • 2502.13923 • Published Feb 19 • 193

Chain of Draft: Thinking Faster by Writing Less
Paper • 2502.18600 • Published Feb 25 • 50

URECA: Unique Region Caption Anything
Paper • 2504.05305 • Published Apr 7 • 36

An Empirical Study of Qwen3 Quantization
Paper • 2505.02214 • Published May 4 • 24

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Paper • 2505.09568 • Published May 14 • 94

WorldPM: Scaling Human Preference Modeling
Paper • 2505.10527 • Published May 15 • 33

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published 3 days ago • 44