Question about 8K models with an AMD Radeon RX 6950 XT
Hello, I'm new to all this about models and Pygmalion, although I've already joined the Discords. I recently bought an AMD Radeon RX 6950 XT graphics card, and I've run into the problem that people say ExLlama only works on NVIDIA, or on AMD under Linux. I don't use Linux; I have a legitimately purchased copy of Windows 10 Home. Can I run any of your models with 6k or 8k tokens of context, even without ExLlama, using my 16 GB of VRAM and 32 GB of RAM on Windows 10 with my AMD card? If ExLlama is strictly required, is there any way to make it work? Are you working on a version that works on Windows for AMD?
GPTQ models don't work on AMD on Windows, no. There's nothing I can do about that; only AMD can change it, by releasing their ROCm code for Windows, which so far they haven't done.
But GGML models also provide GPU acceleration, and they work with AMD GPUs on all platforms. You're commenting on a GGML model here. Check this README for details on using KoboldCpp, read its instructions, and then run it with OpenCL acceleration. That'll give you AMD GPU acceleration with up to 8k context size.
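As a rough sketch of what that looks like (flag names may differ between KoboldCpp releases, and the model filename below is a placeholder, so check the README and `--help` output for your version), launching KoboldCpp from a Windows command prompt with OpenCL (CLBlast) acceleration and an 8k context would be something like:

```
:: Placeholder model filename -- substitute the GGML file you downloaded.
koboldcpp.exe --model pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin ^
  --useclblast 0 0 ^
  --gpulayers 35 ^
  --contextsize 8192
```

Here `--useclblast 0 0` selects the first OpenCL platform and device, `--gpulayers` sets how many layers are offloaded to VRAM (lower it if you run out of your 16 GB), and `--contextsize 8192` enables the full 8k context.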