Rui Yang PRO

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

upvoted a paper about 1 month ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

liked a model about 1 month ago

amandaa/AutoL2S-7b

upvoted a paper 2 months ago

MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks

commented on a paper 2 months ago

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

Paper • 2510.25897 • Published Oct 29, 2025 • 16

commented on a paper 3 months ago

ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14, 2025 • 27

commented on a paper 7 months ago

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Paper • 2506.00123 • Published May 30, 2025 • 35

New activity in microsoft/GUI-Actor-Verifier-2B 7 months ago

Update README.md

#1 opened 7 months ago by

Ray2333

commented on a paper 7 months ago

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published May 30, 2025 • 15

commented on a paper 8 months ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5, 2025 • 25

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 8 months ago

Bug in readme implementation

#3 opened 8 months ago by

jvelja

New activity in microsoft/Magma-8B 10 months ago

generation_args in the example

❤️ 2

#10 opened 10 months ago by

Ray2333

New activity in EmbodiedBench/EB-Manipulation 11 months ago

Add dataset card

#1 opened 11 months ago by

nielsr

New activity in Ray2333/Gemma-2B-rewardmodel-baseline 11 months ago

trained dataset and fine-tuned method

#1 opened 11 months ago by

glgjss960

commented on 2 papers 11 months ago

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Paper • 2502.13131 • Published Feb 18, 2025 • 37

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published Feb 13, 2025 • 35

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 11 months ago

Update default tokenization behavior to "longest" in README

#2 opened 11 months ago by

MichaelR207

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft about 1 year ago

Model Size

#1 opened about 1 year ago by

szhang120

commented on a paper about 1 year ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

New activity in Ray2333/GRM-llama3-8B-sftreg about 1 year ago

Adding `safetensors` variant of this model

#3 opened about 1 year ago by

SFconvertbot

New activity in Ray2333/GRM-llama3-8B-sftreg over 1 year ago

Abnormally Large Memory Footprint?

#2 opened over 1 year ago by

RylanSchaeffer

Some weights of the model checkpoint at Ray2333/GRM-llama3-8B-sftreg were not used when initializing

#1 opened over 1 year ago by

RylanSchaeffer

New activity in Ray2333/gpt2-large-harmless-reward_model over 1 year ago

Load failed:There is no "pytorch_model.bin", how to load the model?

#3 opened over 1 year ago by

Hanlard

a bug when loading model

#2 opened over 1 year ago by

ssmmzz
