Xiaobo Wang

Yofuria

https://github.com/Yofuria

AI & ML interests

Natural Language Processing

Recent Activity

upvoted a paper 28 days ago

RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

updated a dataset about 1 month ago

Yofuria/llama3-ultrafeedback-armorm-swapped-40

published a dataset about 1 month ago

Yofuria/llama3-ultrafeedback-armorm-swapped-40

Organizations

upvoted a paper 28 days ago

RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

Paper • 2506.08672 • Published 29 days ago • 31

upvoted a paper about 1 month ago

ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection

Paper • 2505.16475 • Published May 22 • 2

upvoted a paper about 2 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 26

upvoted a paper 3 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29 • 18

upvoted a paper 4 months ago

From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26 • 30

upvoted an article 4 months ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • and 3 others • Dec 9, 2022 • 294

upvoted a paper 10 months ago

VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

Paper • 2409.01071 • Published Sep 2, 2024 • 28

upvoted 3 papers about 1 year ago

VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

Paper • 2406.16338 • Published Jun 24, 2024 • 27

RAM: Towards an Ever-Improving Memory System by Learning from Communications

Paper • 2404.12045 • Published Apr 18, 2024 • 2

In-Context Editing: Learning Knowledge from Self-Induced Distributions

Paper • 2406.11194 • Published Jun 17, 2024 • 15
