From Ether to Syntax: A Meta-Analytic Exploration of Linguistic Algorithmic Landscapes

#6
by mradermacher - opened

continued....

mradermacher changed discussion status to closed

Here is a complete list of the newly added architectures.

The non-mm-archs are picked up automatically when llama is updated (rather, nothing checks for these archs, other than the script that shows me daily models).

Nice. Will do in case you forgot any vision/audio architecture.

In case you need it, the list/regex is currently in /llmjob/share/llmjob.pm - search for is_vision.
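
Roughly, it boils down to matching the architecture name against that regex; here is an illustrative Python sketch of the idea (the real check is Perl in llmjob.pm, and the arch names below are only examples, not the actual list):

    import re

    # Illustrative only: the real list lives as a Perl regex in
    # /llmjob/share/llmjob.pm (search for is_vision); the arch names
    # below are examples, not the actual list.
    VISION_ARCH_RE = re.compile(
        r"KimiVLForConditionalGeneration"
        r"|Qwen2VLForConditionalGeneration"
        r"|LlavaForConditionalGeneration"
    )

    def is_vision(arch: str) -> bool:
        """True if the architecture string counts as multi-modal ("vision")."""
        return VISION_ARCH_RE.search(arch) is not None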

Also, vision is mradermacher code for multi-modal from now on.

BERT-based architectures seem to be incredible

I might exclude them from the daily list for that reason, and because they are likely not popular with the people who consume GGUFs (and most fail because small models tend to have custom tokenizers).

Nice, I just discovered an easy way to requeue previously failed architectures:

Yup, shell-greppable logs for the win.

Update: oh, it's not even the real log file, "just" the llmc why transform of it.
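
For illustration only, the idea is just to grep the failure output for an architecture name and feed the matching models back into the queue; in practice it is the shell-greppable output of llmc why that gets grepped. A hypothetical Python sketch (log path, line format and requeue step are all made up):

    import re
    import sys

    # Hypothetical sketch: the log format and requeue step are made up;
    # the real workflow greps the output of `llmc why` instead.
    def failed_models(log_path: str, arch: str):
        pat = re.compile(rf"failed.*arch={re.escape(arch)}.*model=(\S+)")
        with open(log_path) as log:
            for line in log:
                m = pat.search(line)
                if m:
                    yield m.group(1)

    if __name__ == "__main__":
        for model in failed_models(sys.argv[1], sys.argv[2]):
            print(model)  # pipe this into whatever requeues the model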

@RichardErkhov vision models should not be queued to rich1 unless they are not being detected as such (and then no vision extraction should happen).

The non-vision jobs are limited to 32GB ram, too. No clue what happened. Very troubling.

However, this morning, only besteffort models were queued on rich1. Who knows what nico queued...

Well, good to know. Usually you take like 4-8GB, but something went wrong today. The peak recorded by Proxmox was 24GB (so I assume it was even higher, but due to the total OOM it might not have recorded the full number). I added swap on root just in case this happens again, so at least other things on the server don't die haha

llmc audit besteffort skips the besteffort models for me.

Please restart the Audio-Reasoner imatrix computation. I killed it earlier today because it ran on the CPU. I'm still not sure what makes GPUs occasionally temporarily disappear, but it seems related to them being used in a different container.

llmc audit besteffort skips the besteffort models for me.

Right, arguments were not passed to llmjob audit. Should be fixed now.

@RichardErkhov

Peak recorded by proxmox was 24gb

Well, given that I was officially allowed to use 64GB, 24GB seems absolutely normal. So what is the new limit? 24GB will only allow one quant, and maybe not even that.

The PLM-1.8 imatrix failure looks somewhat interesting:

/llmjob/llama.cpp-cuda512/ggml/src/ggml-cuda/ggml-cuda.cu:82: CUDA error
CUDA error: the requested functionality is not supported
  current device: 0, in function ggml_cuda_mul_mat_batched_cublas_impl at /llmjob/llama.cpp-cuda512/ggml/src/ggml-cuda/ggml-cuda.cu:1939
  cublasGemmStridedBatchedEx(ctx.cublas_handle(), CUBLAS_OP_T, CUBLAS_OP_N, ne01, ne11, ne10, alpha, src0_ptr, cu_data_type_a, nb01/nb00, nb02/nb00, src1_ptr, cu_data_type_b, s11, s12, beta, dst_t, cu_data_type, ne0, ne1*ne0, ne12*ne13, cu_compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)

I wonder what functionality is missing. bf16 support? Maybe something went wrong compiling the kernels? llama.cpp "recently" (months ago?) changed which kernels actually get compiled for which archs, to save space, but I can't seem to find the issue for it atm.

It's this call:

    if (r2 == 1 && r3 == 1 && ggml_is_contiguous_2(src0) && ggml_is_contiguous_2(src1)) {
        // there is no broadcast and src0, src1 are contiguous across dims 2, 3
        // use cublasGemmStridedBatchedEx
        CUBLAS_CHECK(
        cublasGemmStridedBatchedEx(ctx.cublas_handle(), CUBLAS_OP_T, CUBLAS_OP_N,
                ne01, ne11, ne10,
                alpha, src0_ptr, cu_data_type_a, nb01/nb00, nb02/nb00, // strideA
                       src1_ptr, cu_data_type_b, s11,       s12,       // strideB
                beta,     dst_t, cu_data_type,   ne0,       ne1*ne0,   // strideC
                ne12*ne13,
                cu_compute_type,
                CUBLAS_GEMM_DEFAULT_TENSOR_OP));
    } else {
        // use cublasGemmBatchedEx
        const int64_t ne23 = ne12*ne13;

        ggml_cuda_pool_alloc<const void *> ptrs_src(ctx.pool(), 2*ne23);
        ggml_cuda_pool_alloc<      void *> ptrs_dst(ctx.pool(), 1*ne23);
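
One thing I'd check first (a quick sketch, assuming PyTorch with CUDA happens to be installed on that host): whether the card even reports bf16 support, since the cuBLAS bf16 GEMM paths generally want compute capability 8.0 (Ampere) or newer, and an older card would explain "the requested functionality is not supported".

    import torch

    # Quick sanity check; assumes PyTorch with CUDA is available on the host.
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            print(f"device {i}: {torch.cuda.get_device_name(i)}, "
                  f"compute capability {major}.{minor}")
        # is_bf16_supported() only looks at the current device
        print("bf16 supported on current device:", torch.cuda.is_bf16_supported())
    else:
        print("no CUDA device visible")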

@mradermacher Please update to the latest llama.cpp version in our fork.

Our fork finally adds the --outtype source option to convert_hf_to_gguf.py. It now keeps F16, BF16 and F32 tensors in their original datatype, falls back to F16 for unknown datatypes, and keeps storing tensors that should always be F32 in F32 according to the GGUF specification. I tested this option for a few models and found no issues so far. I might even try to upstream this change as it seems really useful, so I recommend you make use of it after updating by specifying --outtype source.
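
In Python-ish pseudocode, the dtype selection behaves roughly like this (an illustrative sketch of the described behavior, not the actual convert_hf_to_gguf.py code; must_be_f32 is a made-up stand-in for the GGUF rule that certain tensors are always stored as F32):

    # Illustrative sketch of how --outtype source picks the stored dtype;
    # not the real convert_hf_to_gguf.py code.
    def output_dtype(tensor_dtype: str, must_be_f32: bool) -> str:
        if must_be_f32:
            return "F32"         # tensors the GGUF spec always stores as F32
        if tensor_dtype in ("F16", "BF16", "F32"):
            return tensor_dtype  # keep the source datatype as-is
        return "F16"             # unknown datatypes fall back to F16

    # Usage stays the same as before, just with the new outtype, e.g.:
    #   python convert_hf_to_gguf.py <model_dir> --outtype source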

Imatrix changes:

Other important changes:

  • HunYuanDenseV1ForCausalLM support
  • Qwen3-Embedding models
  • fix tokenizer for JetBrains Mellum
  • KimiVLForConditionalGeneration (text only)
  • Glm4MoeForCausalLM support

@mradermacher If you have time, please also mark the cogito-v2-preview-llama-405B imatrix task as imatrix RPC. If you don't want to use RPC we could use /root/cogito-v2-preview-llama-405B.Q8_0.gguf but I believe the model deserves RPC.

I'm so surprised git managed to remove it from imatrix.cpp automatically during merging.

If true, wouldn't that be a bug? Git is not supposed to silently remove changes on conflicts. It's the whole point of such a system to make sure changes are not silently overwritten :)

--outtype source

That seems exactly like what I was asking for / what I would have expected it to do by default already. The only issues I can see are either too-big models, or issues with mixed arithmetic in kernels. Anyway, it's the default now, once llama has been updated.

updated, and using --outtype source for everything now

cogito-v2-preview-llama-405B

marked, no quant

and we have some fat glm 4.5 models in the queue

It would be nice to have some models where we can actually provide all quants for a change. Sigh. Right now, it feels like no nontrivial model survives quant creation without hacks.

PS: I've included IQ3_XXS in the quants we skip for "nolow".

PPS: especially frustrating because these big models really deserve low-bit quants.

@mradermacher Please update llama.cpp to the latest version of our fork so we can do https://huggingface.co/openai/gpt-oss-120b and https://huggingface.co/openai/gpt-oss-20b
