Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 142
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published May 27, 2024 • 55
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22, 2024 • 45
Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models Paper • 2212.03860 • Published Dec 7, 2022 • 1
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery Paper • 2302.03668 • Published Feb 7, 2023 • 1
NEFTune: Noisy Embeddings Improve Instruction Finetuning Paper • 2310.05914 • Published Oct 9, 2023 • 14
Cramming: Training a Language Model on a Single GPU in One Day Paper • 2212.14034 • Published Dec 28, 2022
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models Paper • 2306.13651 • Published Jun 23, 2023 • 15
On the Reliability of Watermarks for Large Language Models Paper • 2306.04634 • Published Jun 7, 2023 • 5
Understanding and Mitigating Copying in Diffusion Models Paper • 2305.20086 • Published May 31, 2023 • 3
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust Paper • 2305.20030 • Published May 31, 2023 • 8