Liu Xiaoran

LiuXR

AI & ML interests

None yet

Recent Activity

commented on a paper 3 days ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

liked a model 4 days ago

POLARIS-Project/Polaris-4B-Preview

liked a model 5 days ago

ziweihe/fourier-transformer-cnndm

Organizations

None yet

upvoted a paper 7 days ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published 8 days ago • 43

upvoted a paper 8 days ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published 9 days ago • 234

upvoted 5 papers 9 days ago

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26, 2024 • 34

CoLLiE: Collaborative Training of Large Language Models in an Efficient Way

Paper • 2312.00407 • Published Dec 1, 2023 • 3

DetectiveQA: Evaluating Long-Context Reasoning on Detective Novels

Paper • 2409.02465 • Published Sep 4, 2024 • 1

LongWanjuan: Towards Systematic Measurement for Long Text Quality

Paper • 2402.13583 • Published Feb 21, 2024 • 1

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Paper • 2506.11886 • Published 12 days ago • 20

upvoted a paper about 1 month ago

ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning

Paper • 2505.15776 • Published May 21 • 10

upvoted a collection 3 months ago

VideoRoPE: What Makes for Good Video Rotary Position Embeddi

A storage repo for VideoRoPE. • 6 items • Updated 8 days ago • 3

upvoted 8 papers 3 months ago

Farewell to Length Extrapolation, a Training-Free Infinite Context with Finite Attention Scope

Paper • 2407.15176 • Published Jul 21, 2024 • 3

Scaling Laws of RoPE-based Extrapolation

Paper • 2310.05209 • Published Oct 8, 2023 • 8

A Comprehensive Survey on Long Context Language Modeling

Paper • 2503.17407 • Published Mar 20 • 49

DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting

Paper • 2503.00784 • Published Mar 2 • 13

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 149

Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs

Paper • 2502.14837 • Published Feb 20 • 4

LongRoPE2: Near-Lossless LLM Context Window Scaling

Paper • 2502.20082 • Published Feb 27 • 39

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Paper • 2503.10480 • Published Mar 13 • 54

upvoted 3 papers 4 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 80

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published Feb 24 • 73

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 65