
Baohao Liao

baohao

https://baohaoliao.github.io/

AI & ML interests

NLP

Recent Activity

authored a paper about 2 months ago

Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning

authored a paper about 2 months ago

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

authored a paper about 2 months ago

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token



authored 6 papers about 2 months ago

Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning

Paper • 2306.00477 • Published Jun 1, 2023 • 1

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

Paper • 2402.05147 • Published Feb 7, 2024

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

Paper • 2211.04898 • Published Nov 9, 2022

3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability

Paper • 2409.00119 • Published Aug 28, 2024

Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation

Paper • 2505.06027 • Published May 9 • 18

Lost at the Beginning of Reasoning

Paper • 2506.22058 • Published Jun 27 • 1

upvoted a paper about 2 months ago

Lost at the Beginning of Reasoning

Paper • 2506.22058 • Published Jun 27 • 1

New activity in deepseek-ai/DeepSeek-R1-0528-Qwen3-8B 2 months ago

Model collapse after SFT

#14 opened 3 months ago by Banjiuyufen

upvoted a paper 3 months ago

Fractured Chain-of-Thought Reasoning

Paper • 2505.12992 • Published May 19 • 22

authored a paper 3 months ago

Fractured Chain-of-Thought Reasoning

Paper • 2505.12992 • Published May 19 • 22

upvoted a paper 3 months ago

Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation

Paper • 2505.06027 • Published May 9 • 18

updated a dataset 5 months ago

baohao/rsd

Preview • Updated Mar 31 • 5

published a dataset 5 months ago

baohao/rsd

Preview • Updated Mar 31 • 5

upvoted an article 5 months ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • Feb 11 • 58

New activity in Qwen/QwQ-32B 6 months ago

missing opening <think>

#4 opened 6 months ago by chriswritescode

New activity in QuixiAI/DeepSeek-R1-AWQ 6 months ago

Deployment framework

#2 opened 7 months ago by xro7

New activity in open-r1/README 6 months ago

[Experiment] Applying GRPO to DeepSeek-R1-Distill-Qwen-1.5B with LIMO

😎 🔥 22

#15 opened 6 months ago by lewtun

commented on a paper 7 months ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 40

authored a paper 7 months ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 40

upvoted a paper 7 months ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 40
