Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis — arXiv:2505.13227, published May 19, 2025
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models — arXiv:2505.10554, published May 15, 2025
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models — arXiv:2309.14717, published Sep 26, 2023
Reward-Guided Speculative Decoding for Efficient LLM Reasoning — arXiv:2501.19324, published Jan 31, 2025
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs — arXiv:2410.04698, published Oct 7, 2024
PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search — arXiv:1907.05737, published Jul 12, 2019
Trained Rank Pruning for Efficient Deep Neural Networks — arXiv:1812.02402, published Dec 6, 2018
TRP: Trained Rank Pruning for Efficient Deep Neural Networks — arXiv:2004.14566, published Apr 30, 2020
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models — arXiv:2402.14800, published Feb 22, 2024
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models — arXiv:2405.16057, published May 25, 2024
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments — arXiv:2405.20202, published May 30, 2024