xLSTM-7B-Instruct model

#2
by mrs83 - opened

I noticed that NX-AI’s xLSTM-7b lacks an instruct model variant fine-tuned for instruction-following tasks.

An instruct model would greatly enhance its utility for applications like QA, virtual assistants, and domain-specific tasks.

Would NX-AI or the community consider creating one?

NLP may not be the main focus of NX-AI's xLSTM, but it still performs remarkably well.

https://huggingface.co/mrs83/FlowerTune-xLSTM-7b-NLP-PEFT

With this adapter, the base model reaches 15.35% average accuracy on NLP tasks (STEM, Social Sciences, Humanities) after instruction fine-tuning, using a small portion (25%) of the vicgalle/alpaca-gpt4 dataset as training data.

I am wondering what could be achieved by training a full instruct model. Could it surpass models like mistralai/Mistral-7B-Instruct-v0.3?

Thanks, Massimo, for keeping a glimmer of hope alive with your contribution, which went practically unnoticed and without a single comment for over a month!
With bated breath I've been following Sepp's big xLSTM talk for, I'd guess, well over a year (if not two or three).
By YouTube chance I found this desert here, while scrambling to keep up with practical announcements from MS (GitHub Copilot), OpenAI o1, DeepMind Gemini 2, Chinese DeepSeek R1, and the like.

I'm wondering whether I should waste any time beyond this message on a 7B model shipped in F32 with no quantization, which doesn't fit on any single consumer GPU. (I guess it would fit across my 2x20 GB RTX 4000 Ada cards, though.)
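For concreteness, a back-of-the-envelope sketch (plain Python; assumes roughly 7e9 parameters and counts weights only, ignoring activations and cache overhead) of why F32 is the sticking point:

```python
# Weight-memory estimate for a ~7B-parameter model (a sketch; the exact
# parameter count and runtime overhead will differ in practice).
PARAMS = 7e9

def weights_gb(bytes_per_param: float) -> float:
    """Gigabytes needed just to hold the weights."""
    return PARAMS * bytes_per_param / 1e9

print(f"F32:  {weights_gb(4):.0f} GB")  # ~28 GB: too big for one 20 GB card,
                                        # but splittable across 2x20 GB
print(f"BF16: {weights_gb(2):.0f} GB")  # ~14 GB: fits on a single 20 GB GPU
print(f"INT8: {weights_gb(1):.0f} GB")  # ~7 GB: comfortable on consumer hw
```

So even a simple cast to BF16, let alone 8-bit quantization, would bring the model within reach of a single 20 GB card.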
Disappointed
G.
