Compatible small models for speculative decoding?
#9 opened 8 months ago
by
treehugg3

How much GPU RAM is needed?
1
#8 opened 10 months ago
by
RaidXD

q8 with 8 part
#7 opened 10 months ago
by
sdyy

Q6_K vs. Q5_K_L
3
#6 opened 10 months ago
by
AIGUYCONTENT

Unable to pull in from Ollama
5
#3 opened 11 months ago
by
AIGUYCONTENT

Observation: 4-bit quantization can't answer the Strawberry prompt
12
#2 opened 11 months ago
by
ThePabli

Nemotron 51B too please
4
#1 opened 11 months ago
by
nacs