Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11 • 94
Matryoshka: Learning to Drive Black-Box LLMs with LLMs Paper • 2410.20749 • Published Oct 28, 2024 • 1
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline Paper • 2510.07307 • Published Oct 8 • 5
Task-Specific Zero-shot Quantization-Aware Training for Object Detection Paper • 2507.16782 • Published Jul 22 • 9
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering Paper • 2505.07782 • Published May 12 • 19