SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training · 9 authors · submitted by akhaliq · 100 upvotes · 6 comments
Optimizing Large Language Model Training Using FP4 Quantization · 8 authors · submitted by akhaliq · 32 upvotes · 2 comments
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling · 7 authors · submitted by akhaliq · 23 upvotes · 4 comments
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation · 5 authors · submitted by paulpanwang · 21 upvotes · 3 comments
Low-Rank Adapters Meet Neural Architecture Search for LLM Compression · 3 authors · submitted by akhaliq · 7 upvotes · 2 comments
IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding · 7 authors · submitted by amanchadha · 6 upvotes · 2 comments
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models · 5 authors · submitted by akhaliq · 4 upvotes · 2 comments
Histoires Morales: A French Dataset for Assessing Moral Alignment · 7 authors · submitted by iproskurina · 3 upvotes · 2 comments