Models upscaled with the Block Expansion (BE) method. Unlike the more common Depth Up-Scaling (DUS), BE does not require fine-tuning to recover lost performance.
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Interleaved
  Text Generation • Updated • 32 • 2
- Pretergeek/OpenChat-3.5-0106_8.99B_40Layers-Interleaved
  Text Generation • Updated • 27 • 2
- Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
  Text Generation • Updated • 22 • 2
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Appended
  Text Generation • Updated • 33 • 2
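
The "Interleaved" and "Appended" suffixes in the model names above suggest two ways of placing the added blocks. The following is a minimal sketch of that idea, not the author's actual procedure. It assumes a 32-layer base model, evenly spaced insertion for the interleaved case, and new blocks created as copies of an existing layer with the output projection zeroed so that each new block initially acts as an identity. The names `expansion_plan`, `residual_block`, and `run_stack` are hypothetical, and the scalar "block" is a toy stand-in for a transformer layer.

```python
def expansion_plan(num_original, num_added, mode="interleaved"):
    """Describe the expanded layer stack as a list of (kind, source_index).

    ("orig", i) is original layer i, unchanged.
    ("new", i)  is a copy of original layer i with its output weight zeroed
                (an assumption of this sketch).
    """
    if mode == "appended":
        # All new blocks go after the last original layer.
        plan = [("orig", i) for i in range(num_original)]
        plan += [("new", num_original - 1)] * num_added
        return plan
    if mode != "interleaved":
        raise ValueError("mode must be 'interleaved' or 'appended'")
    if num_added <= 0 or num_original % num_added != 0:
        raise ValueError("num_added must evenly divide num_original")
    group = num_original // num_added
    plan = []
    for i in range(num_original):
        plan.append(("orig", i))
        if (i + 1) % group == 0:
            # One new block after each group of `group` original layers.
            plan.append(("new", i))
    return plan


def residual_block(x, w_in, w_out):
    """Toy scalar residual block: x + w_out * relu(w_in * x)."""
    return x + w_out * max(0.0, w_in * x)


def run_stack(x, weights, plan):
    """Apply the blocks listed in `plan`; new blocks use a zeroed output weight."""
    for kind, i in plan:
        w_in, w_out = weights[i]
        if kind == "new":
            w_out = 0.0  # zeroed output projection, so the block returns x
        x = residual_block(x, w_in, w_out)
    return x


# Hypothetical toy weights for a 32-layer base model.
weights = [(0.5 + 0.01 * i, 0.1) for i in range(32)]
base_plan = [("orig", i) for i in range(32)]
interleaved = expansion_plan(32, 4, mode="interleaved")
appended = expansion_plan(32, 4, mode="appended")

base_out = run_stack(1.0, weights, base_plan)
interleaved_out = run_stack(1.0, weights, interleaved)
appended_out = run_stack(1.0, weights, appended)
```

In this toy setting both expanded stacks have 36 blocks and reproduce the base output exactly, which illustrates why an identity-initialized expansion would not need fine-tuning just to recover the original behaviour. Whether the listed models were built exactly this way is not stated in the listing.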