Yang Liu

yliu-cs · https://yliu-cs.github.io

AI & ML interests

Multi-Modal Learning

Organizations

None yet

yliu-cs's activity

upvoted a paper 5 days ago

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published 6 days ago • 90

upvoted a paper 6 days ago

HoliTom: Holistic Token Merging for Fast Video Large Language Models

Paper • 2505.21334 • Published 6 days ago • 18

authored a paper 12 days ago

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Paper • 2409.07239 • Published Sep 11, 2024 • 15

authored a paper 13 days ago

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Paper • 2505.12448 • Published 15 days ago • 10

upvoted a paper 13 days ago

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Paper • 2505.12448 • Published 15 days ago • 10

updated a collection 13 days ago

SSR

Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning • 5 items • Updated 9 days ago • 1

updated a dataset 13 days ago

yliu-cs/SSR-CoT

Viewer • Updated 13 days ago • 1.2M • 98 • 1

updated 2 models 13 days ago

yliu-cs/SSR-VLM-7B

Updated 13 days ago • 1

yliu-cs/SSR-MIDI-7B

Updated 13 days ago • 6 • 1

updated a collection 13 days ago

SSR

Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning • 5 items • Updated 9 days ago • 1

published a model 13 days ago

yliu-cs/SSR-VLM-7B

Updated 13 days ago • 1

updated a collection 13 days ago

SSR

Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning • 5 items • Updated 9 days ago • 1

published a model 13 days ago

yliu-cs/SSR-MIDI-7B

Updated 13 days ago • 6 • 1

published a dataset 13 days ago

yliu-cs/SSR-CoT

Viewer • Updated 13 days ago • 1.2M • 98 • 1

upvoted a paper 21 days ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published 21 days ago • 77

upvoted a paper 9 months ago

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Paper • 2409.07239 • Published Sep 11, 2024 • 15

liked a model 10 months ago

theaiinstitute/theia-base-patch16-224-cdiv

Feature Extraction • Updated Jul 30, 2024 • 11.8k • 8

upvoted a collection 10 months ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 28 days ago • 227