Training Day
LucasThil/randomized_clean_miniwob_episodes__image0_5000_v2
Viewer • Updated • 2.5k • 145
LucasThil/miniwob_plusplus_hierarchical_training_actions_drain
Viewer • Updated • 40.2k • 18 • 1
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Paper • 2503.22677 • Published • 5
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
Paper • 2503.23022 • Published • 6
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement
Paper • 2504.03561 • Published • 18
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Paper • 2502.12130 • Published • 2
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
Paper • 2307.12856 • Published • 36
SmolVLM: Redefining small and efficient multimodal models
Paper • 2504.05299 • Published • 202
Personalize Anything for Free with Diffusion Transformer
Paper • 2503.12590 • Published • 44
Compositional Foundation Models for Hierarchical Planning
Paper • 2309.08587 • Published • 11
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Paper • 2309.10150 • Published • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 188
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 143
Learning to Reason under Off-Policy Guidance
Paper • 2504.14945 • Published • 88
Physics of Language Models: Part 1, Context-Free Grammar
Paper • 2305.13673 • Published • 7
Aligning Latent Spaces with Flow Priors
Paper • 2506.05240 • Published • 27
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details
Paper • 2506.16504 • Published • 31
AlphaGo Moment for Model Architecture Discovery
Paper • 2507.18074 • Published • 1
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
Paper • 2508.20096 • Published • 36
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
Paper • 2509.08755 • Published • 56
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
Paper • 2509.12203 • Published • 19
SynCircuit: Automated Generation of New Synthetic RTL Circuits Can Enable Big Data in Circuits
Paper • 2509.00071 • Published
Chunked TabPFN: Exact Training-Free In-Context Learning for Long-Context Tabular Data
Paper • 2509.00326 • Published