Gyanateet Dutta

Ryukijano

https://ryukijano.github.io

AI & ML interests

Computer Vision, Robotics, Generative modelling, ML in the browser, healthcare applications, intersection of art and ML.

Recent Activity

upvoted a paper 3 days ago

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Paper • 2512.23705 • Published 7 days ago • 43

upvoted a paper 4 days ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 5 days ago • 201

upvoted a paper 5 days ago

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 9 days ago • 43

upvoted an article 24 days ago

Why You Should Care About Partial Differential Equations (PDEs)

Article • 25 days ago • 35

upvoted a paper about 2 months ago

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Paper • 2510.27606 • Published Oct 31, 2025 • 28

upvoted a paper 2 months ago

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Paper • 2510.25889 • Published Oct 29, 2025 • 65

upvoted a paper 3 months ago

Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training

Paper • 2510.12586 • Published Oct 14, 2025 • 108

upvoted an article 4 months ago

SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence

Article • Sep 2, 2025 • 36

upvoted 4 papers 4 months ago

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Paper • 2508.20072 • Published Aug 27, 2025 • 31

Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20, 2025 • 38

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Paper • 2508.14879 • Published Aug 20, 2025 • 68

Do What? Teaching Vision-Language-Action Models to Reject the Impossible

Paper • 2508.16292 • Published Aug 22, 2025 • 9

upvoted a collection 5 months ago

NVIDIA Nemotron V2

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 13 days ago • 100

upvoted an article 5 months ago

Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training

Article • May 17, 2025 • 11

upvoted a paper 5 months ago

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11, 2025 • 29

upvoted a collection 5 months ago

The Well

A 15TB collection of physics simulation datasets. • 18 items • Updated Mar 24, 2025 • 41

upvoted a paper 5 months ago

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Paper • 2508.02193 • Published Aug 4, 2025 • 133

upvoted an article 5 months ago

🪆 Introduction to Matryoshka Embedding Models

Article • Feb 23, 2024 • 185

upvoted a collection 5 months ago

Cosmos-Transfer1-DiffusionRenderer

High-quality video de-lighting and re-lighting based on Cosmos video diffusion framework • 2 items • Updated Oct 2, 2025 • 2

upvoted a paper 6 months ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

Paper • 2506.15681 • Published Jun 18, 2025 • 39