James Chang

strategist922

AI & ML interests

Multimodal Learning

Recent Activity

liked a model about 1 hour ago

csuhan/TA-Tok

liked a model about 2 hours ago

Qwen/Qwen-Image

Organizations

None yet

upvoted a paper about 1 hour ago

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Paper • 2506.18898 • Published Jun 23 • 33

upvoted 2 papers 7 days ago

Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2 • 42

KAT-V1: Kwai-AutoThink Technical Report

Paper • 2507.08297 • Published 25 days ago • 5

upvoted a paper 12 days ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published 24 days ago • 31

upvoted a paper 20 days ago

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Paper • 2506.05551 • Published Jun 5 • 5

upvoted an article 26 days ago

SmolLM3: smol, multilingual, long-context reasoner

Article • and 22 others • 28 days ago • 606

upvoted a paper 26 days ago

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Paper • 2501.18795 • Published Jan 30 • 6

upvoted a collection about 2 months ago

dots.llm1

Collection • 2 items • Updated Jun 11 • 17

upvoted 4 papers 3 months ago

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Paper • 2406.04151 • Published Jun 6, 2024 • 23

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

Paper • 2403.12881 • Published Mar 19, 2024 • 18

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 36

SoundStorm: Efficient Parallel Audio Generation

Paper • 2305.09636 • Published May 16, 2023 • 13

upvoted 8 papers 4 months ago

Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach

Paper • 2410.03160 • Published Oct 4, 2024 • 5

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 161

ReLU^2 Wins: Discovering Efficient Activation Functions for Sparse LLMs

Paper • 2402.03804 • Published Feb 6, 2024 • 4

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

Paper • 2312.12456 • Published Dec 16, 2023 • 45

ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models

Paper • 2402.13516 • Published Feb 21, 2024 • 1

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7, 2024 • 22

Neural Machine Translation by Jointly Learning to Align and Translate

Paper • 1409.0473 • Published Sep 1, 2014 • 7

Effective Approaches to Attention-based Neural Machine Translation

Paper • 1508.04025 • Published Aug 17, 2015 • 3
