zzzac's Collections
TORead
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Paper
•
2402.15627
•
Published
•
35
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
Paper
•
2402.16822
•
Published
•
16
FuseChat: Knowledge Fusion of Chat Models
Paper
•
2402.16107
•
Published
•
37
Multi-LoRA Composition for Image Generation
Paper
•
2402.16843
•
Published
•
29
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Paper
•
2402.17485
•
Published
•
191
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper
•
2402.17753
•
Published
•
19
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
607
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper
•
2310.11453
•
Published
•
96
V3D: Video Diffusion Models are Effective 3D Generators
Paper
•
2403.06738
•
Published
•
28
Stealing Part of a Production Language Model
Paper
•
2403.06634
•
Published
•
91
Algorithmic progress in language models
Paper
•
2403.05812
•
Published
•
18
Chronos: Learning the Language of Time Series
Paper
•
2403.07815
•
Published
•
47
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM
Paper
•
2403.07487
•
Published
•
14
FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
Paper
•
2403.10242
•
Published
•
11
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
58
E5-V: Universal Embeddings with Multimodal Large Language Models
Paper
•
2407.12580
•
Published
•
40
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
Paper
•
2407.12784
•
Published
•
49
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
Paper
•
2407.12327
•
Published
•
78
PaliGemma: A versatile 3B VLM for transfer
Paper
•
2407.07726
•
Published
•
68
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
Paper
•
2407.07895
•
Published
•
40
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper
•
2408.15237
•
Published
•
39
Diffusion Models Are Real-Time Game Engines
Paper
•
2408.14837
•
Published
•
123
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper
•
2408.14906
•
Published
•
139
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
Paper
•
2408.14176
•
Published
•
61
Building and better understanding vision-language models: insights and future directions
Paper
•
2408.12637
•
Published
•
124
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Paper
•
2408.10188
•
Published
•
51
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
•
2408.07055
•
Published
•
66
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper
•
2408.05147
•
Published
•
39
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
•
2408.04619
•
Published
•
156
LLaVA-OneVision: Easy Visual Task Transfer
Paper
•
2408.03326
•
Published
•
60
Language Model Can Listen While Speaking
Paper
•
2408.02622
•
Published
•
38
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Paper
•
2407.16741
•
Published
•
70
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Paper
•
2408.16532
•
Published
•
48
Law of Vision Representation in MLLMs
Paper
•
2408.16357
•
Published
•
93
NVLM: Open Frontier-Class Multimodal LLMs
Paper
•
2409.11402
•
Published
•
73