Jamba
https://huggingface.co/collections/ai21labs/jamba-17-68653e9be386dc69b1f30828
please update llama.cpp :)
@mradermacher I updated our fork, so please update to the latest version and let's queue all newly supported FalconH1ForCausalLM, JambaForCausalLM, and SmolLM3ForCausalLM based models!
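For reference, once the fork is updated, conversion should work the usual way. A minimal sketch (paths are placeholders; this assumes the stock convert_hf_to_gguf.py with its usual --outfile/--outtype flags):

```python
# Sketch: convert one of the newly supported models to GGUF.
# Paths are hypothetical; assumes llama.cpp's convert_hf_to_gguf.py.
import subprocess

MODEL_DIR = "/models/AI21-Jamba-Large-1.7"    # hypothetical local HF snapshot
OUT_FILE = "/gguf/AI21-Jamba-Large-1.7.gguf"  # hypothetical output path

subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        MODEL_DIR,
        "--outfile", OUT_FILE,
        "--outtype", "bf16",  # keep full precision for later imatrix work
    ],
    check=True,
)
```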
@mradermacher
Please mark AI21-Jamba-Large-1.7, AI21-Jamba-Large-1.6 and AI21-Jamba-Large-1.5 as RPC imatrix tasks. Those are 400B models, so we should be able to do RPC imatrix computation at full precision. It seems super tight, but given that 405B works, 400B should fit as well, despite them somehow seeming larger. Did we ever add something so I can flag imatrix jobs as RPC on my own, or is asking you still the only way?
> given that 405B works, 400B should fit as well, despite them somehow seeming larger.
I guess it's model specific...
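Quick back-of-the-envelope check (just a sketch: parameter counts are approximate, and runtime buffers like KV cache and compute graphs are ignored):

```python
# Rough memory estimate for running imatrix at full precision over RPC.
# Parameter counts are approximate; runtime buffers are ignored.
BYTES_PER_PARAM = {"bf16": 2, "q8_0": 1.0625}  # Q8_0 is ~8.5 bits/weight

def weight_gib(params_b: float, fmt: str) -> float:
    """GiB needed just for the weights at the given format."""
    return params_b * 1e9 * BYTES_PER_PARAM[fmt] / 2**30

for name, params_b in [("Llama-405B", 405), ("Jamba-Large (~400B)", 400)]:
    print(f"{name}: {weight_gib(params_b, 'bf16'):.0f} GiB at bf16, "
          f"{weight_gib(params_b, 'q8_0'):.0f} GiB at Q8_0")
```

So the weights alone differ by only ~10 GiB between the two; anything beyond that is model-specific overhead.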
> Did we ever add something so I can flag imatrix jobs as RPC on my own, or is asking you still the only way?
It's been on my mental todo list for a while now. I think we've standardised on the values by now; unless the RPC IPs or so change, the only configurable element would be the quant to use (e.g. Q8_0 or, in this case, none).
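For the record, a sketch of what such a job boils down to (hostnames and paths are placeholders; assumes a llama-imatrix binary built with the RPC backend and its --rpc flag):

```python
# Sketch of an RPC imatrix invocation; hostnames/paths are hypothetical.
import subprocess

quant = None  # the one configurable element: e.g. "Q8_0", or None for full precision
model = f"/gguf/AI21-Jamba-Large-1.7.{quant or 'bf16'}.gguf"

subprocess.run(
    [
        "llama-imatrix",
        "-m", model,
        "-f", "calibration_data.txt",        # training text for the imatrix
        "-o", "AI21-Jamba-Large-1.7.imatrix",
        "--rpc", "node1:50052,node2:50052",  # spread layers across RPC workers
    ],
    check=True,
)
```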
> Please mark AI21-Jamba-Large-1.7, AI21-Jamba-Large-1.6 and AI21-Jamba-Large-1.5 as RPC imatrix tasks.
Done. Also, the override state is set.
BTW, our babyhercules thread is usually better for this kind of message, as I check that one often when I don't have time for the other discussions.