Chih-Kai Yang

zenyn

AI & ML interests

None yet

zenyn's activity

commented on a paper 1 day ago

SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information

Paper • 2505.13237 • Published 5 days ago

published a dataset 1 day ago

SLLM-multi-hop/GenderQA

Viewer • Updated Jan 24 • 500 • 1

upvoted a paper 20 days ago

TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering

Paper • 2504.20114 • Published 27 days ago • 5

upvoted 17 papers about 1 month ago

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published Apr 17 • 39

Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Paper • 2504.13816 • Published Apr 18 • 17

Could Thinking Multilingually Empower LLM Reasoning?

Paper • 2504.11833 • Published Apr 16 • 28

EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

Paper • 2504.15133 • Published Apr 21 • 21

LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs

Paper • 2504.14655 • Published Apr 20 • 19

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published Apr 8 • 110

A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility

Paper • 2504.07086 • Published Apr 9 • 21

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?

Paper • 2504.06514 • Published Apr 9 • 39

SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs

Paper • 2504.08192 • Published Apr 11 • 4

Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models

Paper • 2504.05262 • Published Apr 7 • 11

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11 • 55

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

Paper • 2504.10766 • Published Apr 14 • 40

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published Apr 10 • 46

SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning

Paper • 2504.09081 • Published Apr 12 • 17