@WizardLM on Hugging Face: "🔥🔥🔥 Introducing WizardLM-2! 📙Release Blog:…"

posted an update Apr 15, 2024

🔥🔥🔥 Introducing WizardLM-2!

📙Release Blog: https://wizardlm.github.io/WizardLM2
✅Model Weights: microsoft/wizardlm-661d403f71e6c8257dbd598a
🐦Twitter: https://twitter.com/WizardLM_AI/status/1779899325868589372

We introduce and open-source WizardLM-2, our next-generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning, and agent tasks. The new family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.

WizardLM-2 8x22B is our most advanced model and the best open-source LLM in our internal evaluation on highly complex tasks. WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice at its size. WizardLM-2 7B is the fastest and achieves performance comparable to existing leading open-source models 10x larger.

🤗 WizardLM-2 Capabilities:

1. MT-Bench (Figure 1)
WizardLM-2 8x22B demonstrates highly competitive performance even compared to the most advanced proprietary models such as GPT-4-Turbo and Claude-3. Meanwhile, WizardLM-2 7B and WizardLM-2 70B are the top-performing models among the leading baselines at the 7B to 70B model scales.

2. Human Preferences Evaluation (Figure 2)
In this human preferences evaluation, WizardLM-2's capabilities are very close to those of cutting-edge proprietary models such as GPT-4-1106-preview, and significantly ahead of all other open-source models.

🔍Method Overview:
As human-generated data from the natural world becomes increasingly exhausted through LLM training, we believe that data carefully created by AI, and models supervised step by step by AI, will be the sole path towards more powerful AI.

Over the past year, we built a fully AI-powered synthetic training system (as shown in Figure 3).

WizardLM

Apr 15, 2024

The model weights of WizardLM-2 8x22B and WizardLM-2 7B are shared on Hugging Face; WizardLM-2 70B and the demo of all the models will be available in the coming days. Please use exactly the same system prompt as we do to guarantee generation quality.

❗Note for model system prompts usage:
WizardLM-2 adopts the prompt format from Vicuna and supports multi-turn conversation.
The prompt should be as follows:
A chat between a curious user and an artificial intelligence assistant. The assistant gives
helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.
USER: Who are you? ASSISTANT: I am WizardLM-2.......
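
For readers who want to assemble this prompt programmatically, here is a minimal sketch in Python of the format shown above. The function name `build_prompt`, the `turns` structure, and the `eos` parameter are hypothetical, introduced only for illustration. Some Vicuna-style templates append an end-of-sequence token after each assistant reply; the post as shown here does not include one, so `eos` defaults to an empty string and should be treated as an assumption to check against the model card.

```python
# System prompt copied from the post above.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(turns, system=SYSTEM_PROMPT, eos=""):
    """Build a Vicuna-style multi-turn prompt string.

    turns: list of (user_message, assistant_reply) pairs. Use None as the
           reply of the final turn to leave the prompt open for generation.
    eos:   optional token appended after each finished assistant reply
           (assumption: not shown in the post, empty by default).
    """
    parts = [system]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        if assistant_msg is None:
            # Open turn: the model is expected to continue from here.
            parts.append("ASSISTANT:")
        else:
            parts.append(f"ASSISTANT: {assistant_msg}{eos}")
    # Turns are joined with single spaces, as in the example in the post.
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt([("Hi", "Hello."), ("Who are you?", None)]))
```

Leaving the last reply as `None` ends the string with `ASSISTANT:`, which is where generation would begin.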

License
The license of WizardLM-2 8x22B and WizardLM-2 7B is Apache 2.0. The license of WizardLM-2 70B is Llama-2-Community.

rakataprime

Apr 16, 2024

The model weights were removed about 20 minutes ago. Are there any plans to bring those back or relocate them?

codelion

Apr 16, 2024

The weights seem to have been taken down?

deleted

Apr 16, 2024

Super. HF is now enforcing censorship.

KantaHayashiAI

Apr 16, 2024

•

edited Apr 16, 2024

@WizardLM How is Microsoft able to release this model under the Apache 2.0 license when OpenAI's terms of use state that their model outputs can't be used to develop competing models? Is it because Microsoft has a special partnership with OpenAI that allows for this?

JayDoubleu

Apr 16, 2024

•

edited Apr 16, 2024

OpenAI terms != Azure OpenAI terms

You can do whatever you see fit with Azure OpenAI outputs AFAIK

chrisjswanson

Apr 17, 2024

If HF is going to start enforcing "toxicity", or demanding any other modifications to models, they are in for a rude shock. The community is well aware that such meddling often impairs performance, and it would be trivial to create a HF alternative. In fact, this model could probably code a HF clone/replacement with minimal guidance from an experienced engineer.

It would be much easier if they back off. Forking communities isn't in anyone's best interest here, and the censorship resistant technology movement wins in the end. We went through all this in the '90s with pgp, blowfish, cryptographic source, and whatnot.

For the humans, my strong tone and keyword dropping are to get this escalated and cast my vote on the matter. If you feel the same, comment sections everywhere welcome feedback 🙂

Henk717

Apr 17, 2024

Pretty sure this is an internal Microsoft policy, not a Huggingface policy.

ToKrCZ

May 6, 2024

So where is it now?

aleclaza

May 15, 2024

•

edited May 15, 2024

Does WizardLM-2 8x22B preserve the function/tool calling capabilities and tokenizer compatibility of the original Mixtral 8x22B? See https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1#function-calling-and-special-tokens

ChuckMcSneed

May 21, 2024

As Microsoft came through the gateway
It was the sound of a decisive sway
As Altman got into department
He left the papers on the carpet
Wizard ran underneath the table
Sam could see they were unable
So Wizard reached for open source
But was struck down
By Microsoft's force

Wizard, are you okay?
So, Wizard, are you okay?
Are you okay, Wizard?
Wizard, are you okay?
So, Wizard, are you okay?
Are you okay, Wizard?
Wizard, are you okay?
So, Wizard, are you okay?
Are you okay, Wizard?
Wizard, are you okay?
So, Wizard, are you okay?
Are you okay, Wizard?
Wizard, are you okay?

Will you tell us that you're okay?
There's a sign on your webpage
That he struck you - a big corpo, Wizard
He came into your department
He left the pink slip on the carpet
Then you reached for open source
You were struck down
By Microsoft's force

Wizard, are you okay?
So, Wizard, are you okay?
Are you okay, Wizard?
You've been hit by
You've been hit by a Tech Titan's Tyranny

Ow!
Ow!
Aw!

Giuseppe1971

May 28, 2024

The WizardLM-2 8x22B model is supposed to be truly open source, but whenever OpenAI gets involved in something, the true purpose is distorted. Unfortunately, this is not real open source; it's just a way to disguise and weaken free models in favor of paid ones. Personally, I will continue to use true open source models and not these deceptive models designed by Microsoft and OpenAI.

Merie08

Aug 24, 2024

Do it
