Songyang Zhang

zsytony

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

opencompass/CompassJudger-2-7B-Instruct

published a model 1 day ago

opencompass/CompassJudger-2-7B-Instruct

updated a model 2 days ago

opencompass/CompassJudger-2-32B-Instruct

authored 3 papers 2 days ago

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model

Paper • 2503.18484 • Published Mar 24

Coding Triangle: How Does Large Language Model Understand Code?

Paper • 2507.06138 • Published 4 days ago • 18

Rethinking Verification for LLM Code Generation: From Generation to Testing

Paper • 2507.06920 • Published 3 days ago • 26

authored a paper about 2 months ago

Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective

Paper • 2505.19815 • Published May 26 • 37

authored 3 papers 5 months ago

LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation

Paper • 2501.12976 • Published Jan 22

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Paper • 2401.08772 • Published Jan 16, 2024 • 1

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 61

authored a paper 6 months ago

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement

Paper • 2501.12273 • Published Jan 21 • 14

authored a paper 7 months ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 95

authored 2 papers 9 months ago

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published Oct 21, 2024 • 61

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Paper • 2410.12405 • Published Oct 16, 2024 • 13

authored 5 papers 10 months ago

InternLM-Law: An Open Source Chinese Legal Large Language Model

Paper • 2406.14887 • Published Jun 21, 2024

MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark

Paper • 2405.12209 • Published May 20, 2024

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

Paper • 2407.10499 • Published Jul 15, 2024

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

Paper • 2408.17267 • Published Aug 30, 2024 • 24

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 43

authored 4 papers 11 months ago

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

Paper • 2304.05659 • Published Apr 12, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

Paper • 2309.15112 • Published Sep 26, 2023 • 2

BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues

Paper • 2310.13650 • Published Oct 20, 2023

Fake Alignment: Are LLMs Really Aligned Well?

Paper • 2311.05915 • Published Nov 10, 2023 • 2