interesting - a yujin731 Collection

updated 2 days ago

Describe Anything: Detailed Localized Image and Video Captioning
Paper • 2504.16072 • Published Apr 22 • 61

EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Paper • 2410.09604 • Published Oct 12, 2024

Geospatial Mechanistic Interpretability of Large Language Models
Paper • 2505.03368 • Published May 6 • 9

Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
Paper • 2505.02836 • Published May 5 • 7

Constructing a 3D Town from a Single Image
Paper • 2505.15765 • Published May 21 • 23

SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding
Paper • 2505.17012 • Published May 22 • 12

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
Paper • 2505.17015 • Published May 22 • 9

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
Paper • 2506.00123 • Published May 30 • 34

Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts
Paper • 2505.23926 • Published May 29 • 6

TaskCraft: Automated Generation of Agentic Tasks
Paper • 2506.10055 • Published 23 days ago • 31

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Paper • 2506.16406 • Published 15 days ago • 113

RLPR: Extrapolating RLVR to General Domains without Verifiers
Paper • 2506.18254 • Published 11 days ago • 31

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
Paper • 2506.21656 • Published 8 days ago • 12

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published 3 days ago • 46