13B 4-bit vs 7B 8-bit
Hello, I was just wondering which would be better in terms of speed and quality, and which is the best compromise in your opinion.
I use a Ryzen 7600X, 16GB of DDR4-3200 RAM and an 8GB 3060 Ti.
8GB of VRAM can fit a 13B 4-bit model, but you should be prepared to play around a bit.
For example, a 4_0 might fit comfortably and handle a 400-token input, while a 4_1 fails completely.
I also recommend just using instruct mode, or just running it from the command line. It might also help if you are on Linux/WSL.
At 8GB of VRAM, fitting a 13B 4-bit model means you should do everything you can to save VRAM; every little bit helps.
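To see why it is so tight, here is a rough back-of-the-envelope sketch of the weight sizes. The bits-per-weight figures are my assumptions (block-wise 4-bit formats store a little extra per block for scales and offsets, so they end up above a flat 4 bits), and the parameter counts are the nominal 13B and 7B. It only counts weights, not the context cache or anything else that lives in VRAM.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed effective bits per weight (not measured on real model files):
# 4_0 ~ 4.5, 4_1 ~ 5.0, plain 8-bit = 8.0.
candidates = {
    "13B 4_0": (13, 4.5),
    "13B 4_1": (13, 5.0),
    "7B 8-bit": (7, 8.0),
}

for name, (params_billion, bpw) in candidates.items():
    size = weight_gib(params_billion, bpw)
    print(f"{name}: ~{size:.1f} GiB of weights")
```

Under those assumptions the three come out at roughly 6.8, 7.6 and 6.5 GiB, so a 4_1 leaves very little of an 8GB card for the prompt and everything else, which would fit with it failing where a 4_0 still works.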
IMO 13B 4-bit would give you better results if it is stable, but if you just want a hassle-free experience, use 7B 8-bit and load a LoRA to have some fun :D
Also, don't forget about GGML models: as long as you offload most of the layers to the GPU, you should get acceptable speed.
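If you want a starting guess for how many layers to offload, something like this sketch might help. Everything in it is an assumption for illustration: the helper name is made up, the 6.8 GiB model size and 1.5 GiB reserve are placeholder numbers, and it pretends all layers are the same size (40 layers is what I would expect for a 13B LLaMA-style model). Treat the result as a first value to try, then lower it if you run out of memory.

```python
def layers_that_fit(model_gib: float, n_layers: int,
                    vram_gib: float, reserve_gib: float) -> int:
    """Estimate how many equally sized layers fit in the usable VRAM.

    reserve_gib is VRAM kept free for the context cache, scratch
    buffers and the desktop; it is a guess, not a measured value.
    """
    per_layer_gib = model_gib / n_layers
    usable_gib = max(vram_gib - reserve_gib, 0.0)
    return min(n_layers, int(usable_gib // per_layer_gib))


# Hypothetical numbers: ~6.8 GiB of weights, 40 layers, 8 GiB card,
# 1.5 GiB held back for everything that is not model weights.
n = layers_that_fit(model_gib=6.8, n_layers=40, vram_gib=8.0, reserve_gib=1.5)
print(f"Try offloading about {n} of 40 layers")
```

In llama.cpp the number of offloaded layers is set with the `--n-gpu-layers` (`-ngl`) option, so the estimate would be the value to start from there.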