Zhen Fang

CostaliyA

https://costaliya.github.io/

AI & ML interests

None yet

Recent Activity

updated a model 14 days ago

CostaliyA/cogact_actionly

published a model 15 days ago

CostaliyA/cogact_actionly

Organizations

None yet

upvoted a paper 23 days ago

RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation

Paper • 2506.18088 • Published 27 days ago • 17

upvoted a paper about 1 month ago

CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios

Paper • 2506.13977 • Published Jun 11 • 10

upvoted a paper about 2 months ago

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Paper • 2505.22019 • Published May 28 • 11

upvoted 5 papers 3 months ago

In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Paper • 2504.20690 • Published Apr 29 • 20

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published Apr 21 • 75

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Paper • 2504.07956 • Published Apr 10 • 47

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8 • 172

Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model

Paper • 2504.05594 • Published Apr 8 • 12

upvoted 4 papers 4 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Paper • 2503.16408 • Published Mar 20 • 41

Edit Transfer: Learning Image Editing via Vision In-Context Relations

Paper • 2503.13327 • Published Mar 17 • 29

MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning

Paper • 2503.07459 • Published Mar 10 • 16

upvoted 2 papers 5 months ago

ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents

Paper • 2502.18017 • Published Feb 25 • 20

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published Feb 20 • 48

upvoted a paper 8 months ago

ROICtrl: Boosting Instance Control for Visual Generation

Paper • 2411.17949 • Published Nov 27, 2024 • 88

upvoted a paper 9 months ago

Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published Oct 17, 2024 • 32

upvoted a paper over 1 year ago

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

Paper • 2312.11396 • Published Dec 18, 2023 • 11