Tongyao (tyzhu) · PRO
AI & ML interests: Natural Language Processing
Organizations: None yet
knowledge
- How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
  Paper • 2502.14502 • Published • 91
- From RAG to Memory: Non-Parametric Continual Learning for Large Language Models
  Paper • 2502.14802 • Published • 13
- When an LLM is apprehensive about its answers -- and when its uncertainty is justified
  Paper • 2503.01688 • Published • 21
IR
multilingual
long-context
pretraining
- Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models
  Paper • 2502.15499 • Published • 14
- MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
  Paper • 2502.17422 • Published • 7
- The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
  Paper • 2502.17535 • Published • 8
- Scaling LLM Pre-training with Vocabulary Curriculum
  Paper • 2502.17910 • Published • 1
reasoning
- Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
  Paper • 2502.19361 • Published • 28
- Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
  Paper • 2502.17407 • Published • 26
- Small Models Struggle to Learn from Strong Reasoners
  Paper • 2502.12143 • Published • 39
- Language Models can Self-Improve at State-Value Estimation for Better Search
  Paper • 2503.02878 • Published • 10
daily-papers
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
  Paper • 2409.10516 • Published • 44
- Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
  Paper • 2409.11242 • Published • 7
- Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
  Paper • 2409.11136 • Published • 25
- On the Diagram of Thought
  Paper • 2409.10038 • Published • 14
multimodal