JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 41
OpenX-LeRobot Collection Open X-Embodiment datasets in LeRobot format with standard transformation (https://github.com/Tavish9/any4lerobot) • 34 items • Updated 6 days ago • 14
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control By danaaubakirova and 3 others • Feb 4 • 167
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution Paper • 2312.06640 • Published Dec 11, 2023 • 48
PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining Paper • 2303.08789 • Published Mar 15, 2023 • 1
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 55
MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control Paper • 2411.13807 • Published Nov 21, 2024 • 11
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 18
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published Nov 4, 2024 • 35
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation Paper • 2411.04709 • Published Nov 5, 2024 • 27
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation Paper • 2411.07975 • Published Nov 12, 2024 • 31
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation Paper • 2411.08380 • Published Nov 13, 2024 • 27
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Paper • 2411.04997 • Published Nov 7, 2024 • 40
Adaptive Caching for Faster Video Generation with Diffusion Transformers Paper • 2411.02397 • Published Nov 4, 2024 • 24