- Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
  Paper • 2407.00653 • Published • 11
- Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
  Paper • 2406.18629 • Published • 40
- Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
  Paper • 2406.14562 • Published • 27
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 27
Collections including paper arxiv:2403.04642
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72
- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 18
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
  Paper • 2402.10963 • Published • 9

- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 62
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 39
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Best Practices and Lessons Learned on Synthetic Data for Language Models
  Paper • 2404.07503 • Published • 29

- Pix2Gif: Motion-Guided Diffusion for GIF Generation
  Paper • 2403.04634 • Published • 14
- StableDrag: Stable Dragging for Point-based Image Editing
  Paper • 2403.04437 • Published • 25
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62

- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
  Paper • 2403.03950 • Published • 13
- RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches
  Paper • 2403.02709 • Published • 7

- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 18
- Common 7B Language Models Already Possess Strong Math Capabilities
  Paper • 2403.04706 • Published • 16
- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
  Paper • 2405.14333 • Published • 34