Notes on Applicability of GPT-4 to Document Understanding Paper • 2405.18433 • Published May 28, 2024
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists Paper • 2410.23331 • Published Oct 30, 2024
STable: Table Generation Framework for Encoder-Decoder Models Paper • 2206.04045 • Published Jun 8, 2022
Arctic-TILT. Business Document Understanding at Sub-Billion Scale Paper • 2408.04632 • Published Aug 8, 2024
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper • 1910.02054 • Published Oct 4, 2019
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning Paper • 2104.07857 • Published Apr 16, 2021
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale Paper • 2201.05596 • Published Jan 14, 2022
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies Paper • 2310.04610 • Published Oct 6, 2023
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25, 2024