michaelbenayoun/llama-2-tiny-4kv-heads-4layers-random Text Generation • Updated 3 days ago • 10.5k
michaelbenayoun/llama-2-tiny-4kv-heads-16layers-random Text Generation • Updated 9 days ago • 7.92k
Running 2.66k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters