xmxx's Collections
Daily papers worth reading in detail later
Paper • 2402.13144 • Published • 97
Genie: Generative Interactive Environments
Paper • 2402.15391 • Published • 72
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 89
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 47
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper • 2403.03206 • Published • 66
Stealing Part of a Production Language Model
Paper • 2403.06634 • Published • 92
Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published • 49
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Paper • 2403.12015 • Published • 68
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 106
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 110
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 259
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Paper • 2404.13208 • Published • 40
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 80
What matters when building vision-language models?
Paper • 2405.02246 • Published • 104
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 71
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 132
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 33
To Believe or Not to Believe Your LLM
Paper • 2406.02543 • Published • 35
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Paper • 2406.04325 • Published • 76
Long Context Transfer from Language to Vision
Paper • 2406.16852 • Published • 34
LongIns: A Challenging Long-context Instruction-based Exam for LLMs
Paper • 2406.17588 • Published • 23
PaliGemma: A versatile 3B VLM for transfer
Paper • 2407.07726 • Published • 71