Jinyeong Kim

rubatoyeong · rubato-yeong

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

upvoted a paper about 1 month ago

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

upvoted a paper about 1 month ago

Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs

Organizations

None yet

upvoted 3 papers about 1 month ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11 • 97

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Paper • 2506.10920 • Published Jun 12 • 6

Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs

Paper • 2506.09522 • Published Jun 11 • 20

upvoted a paper 2 months ago

Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

Paper • 2504.20966 • Published Apr 29 • 32

upvoted 16 papers 3 months ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18 • 17

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Paper • 2504.13173 • Published Apr 17 • 19

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published Apr 22 • 61

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17 • 19

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published Apr 17 • 34

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published Apr 17 • 39

FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding

Paper • 2504.09925 • Published Apr 14 • 38

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 276

Towards Visual Text Grounding of Multimodal Large Language Model

Paper • 2504.04974 • Published Apr 7 • 16

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10 • 29

Self-Steering Language Models

Paper • 2504.07081 • Published Apr 9 • 18

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published Apr 8 • 76

Clinical ModernBERT: An efficient and long context encoder for biomedical text

Paper • 2504.03964 • Published Apr 4 • 6

Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published Apr 3 • 17

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Paper • 2504.02821 • Published Apr 3 • 11