AntGroup-Machine Intelligence

company

Recent Activity

zhangxgu

authored 4 papers 8 days ago

UI-Venus Technical Report: Building High-performance UI Agents with RFT

Paper • 2508.10833 • Published 12 days ago • 38

XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding

Paper • 2203.06947 • Published Mar 14, 2022

DiffusionInst: Diffusion Model for Instance Segmentation

Paper • 2212.02773 • Published Dec 6, 2022

Mobile User Interface Element Detection Via Adaptively Prompt Tuning

Paper • 2305.09699 • Published May 16, 2023

zhangxgu

authored 4 papers about 1 month ago

Backpropagation Path Search On Adversarial Transferability

Paper • 2308.07625 • Published Aug 15, 2023

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Paper • 2405.19707 • Published May 30, 2024 • 8

E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion

Paper • 2406.14250 • Published Jun 20, 2024

GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21, 2025 • 131

sunshine-lwt

authored a paper 8 months ago

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 48

sunshine-lwt

authored a paper about 1 year ago

TokenPacker: Efficient Visual Projector for Multimodal LLM

Paper • 2407.02392 • Published Jul 2, 2024 • 24

coura

authored a paper about 1 year ago

TokenPacker: Efficient Visual Projector for Multimodal LLM

Paper • 2407.02392 • Published Jul 2, 2024 • 24

sunshine-lwt

updated a dataset over 1 year ago

AntGroup-MI/Osprey-724K

Preview • Updated Feb 5, 2024 • 118 • 14

sunshine-lwt

authored 3 papers over 1 year ago

Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport

Paper • 2308.01779 • Published Aug 3, 2023 • 1

H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection

Paper • 2210.06742 • Published Oct 13, 2022 • 1

Osprey: Pixel Understanding with Visual Instruction Tuning

Paper • 2312.10032 • Published Dec 15, 2023 • 3