Model Request: ByteDance-Seed/Seed-OSS-36B-Instruct
As the title says. I believe it's basically Qwen 2.5.
36B params... I can't even run it right now.
Will look into it
Thanks! I know it's at least similar enough that inference frameworks can import it as Qwen 2.5/Llama, but I figured it would be convenient for training.
I can try the conversion script if you want, or even make a tight quant to run.
Sure, if you implement the conversion script I can add you to the llamafy
org if you want to upload it here :)
For Qwen2/Qwen2.5 weights I use this script to convert: https://gist.github.com/fakerybakery/0c296b0f1b595bef2b7417b1f67916f9
Script seems to have worked, thanks! I modified the save function to work with newer versions of transformers:
https://gist.github.com/Downtown-Case/ef9aa9677d68f8eec29a54a35d6445b7
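For anyone curious what a "llamafy" conversion involves: since the weights map onto Llama one-for-one, most of the work is a config rewrite. Here is a minimal sketch assuming a Qwen2-style config; the helper name and field handling are illustrative and the linked gists may differ (e.g. in how they handle the tokenizer or sharded weights):

```python
import json

def llamafy_config(config: dict) -> dict:
    """Rewrite a Qwen2-style config dict so it declares the Llama
    architecture. Qwen2 uses q/k/v biases, which Llama-class models in
    recent transformers versions support via the `attention_bias` flag.
    (Hypothetical sketch; the actual conversion scripts may differ.)"""
    out = dict(config)
    out["architectures"] = ["LlamaForCausalLM"]
    out["model_type"] = "llama"
    out["attention_bias"] = True          # keep Qwen2's q/k/v biases
    # Drop Qwen2-only fields that Llama configs don't use:
    out.pop("use_sliding_window", None)
    out.pop("sliding_window", None)
    out.pop("max_window_layers", None)
    return out

# Example Qwen2-style config fragment (values are illustrative):
qwen_cfg = {
    "architectures": ["Qwen2ForCausalLM"],
    "model_type": "qwen2",
    "hidden_size": 5120,
    "use_sliding_window": False,
}
llama_cfg = llamafy_config(qwen_cfg)
print(json.dumps(llama_cfg, indent=2))
```

Fields the target config doesn't recognize are usually just ignored by transformers, so the main risk is losing behavior (like the bias handling) rather than crashing on load.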
Saved tensors look similar, but I will test if it actually inferences coherently in a bit. If it does, I'll upload here if you add me.
Ha! It's literally token identical, or at least it is when loaded via bitsandbytes:
```
~/AI/scripts
venv ❯ python test_byte.py
ByteDance-Seed_Seed-OSS-36B-Instruct
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Loading checkpoint shards: 100%|████████████████████████████████████████| 15/15 [00:28<00:00, 1.92s/it]
The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
<seed:bos>system
You are an intelligent assistant that can answer questions in one step without the need for reasoning and thinking, that is, your thinking budget is 0. Next, please skip the thinking process and directly start answering the user's questions.
<seed:eos><seed:bos>user
How to make pasta?<seed:eos><seed:bos>assistant
<seed:think><seed:cot_budget_reflect>The current thinking budget is 0, so I will directly start answering the question.</seed:cot_budget_reflect>
</seed:think>To make pasta, follow these key steps:
### **1. Prepare the Dough**
- **Ingredients**: 500g (3½ cups) all-purpose or bread
```
```
venv ❯ python test_byte_llamafied.py
ByteDance-Seed_Seed-OSS-36B-Instruct-llamafie
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Loading checkpoint shards: 100%|████████████████████████████████████████| 40/40 [00:19<00:00, 2.07it/s]
The following generation flags are not valid and may be ignored: ['temperature', 'top_p']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
<seed:bos>system
You are an intelligent assistant that can answer questions in one step without the need for reasoning and thinking, that is, your thinking budget is 0. Next, please skip the thinking process and directly start answering the user's questions.
<seed:eos><seed:bos>user
How to make pasta?<seed:eos><seed:bos>assistant
<seed:think><seed:cot_budget_reflect>The current thinking budget is 0, so I will directly start answering the question.</seed:cot_budget_reflect>
</seed:think>To make pasta, follow these key steps:
### **1. Prepare the Dough**
- **Ingredients**: 500g (3½ cups) all-purpose or bread
```
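The token-identical check above amounts to greedy-decoding both checkpoints on the same prompt and comparing the generated token ids. A sketch of that comparison follows; the actual contents of `test_byte.py` aren't shown, so the transformers lines in the comments are illustrative assumptions:

```python
def first_divergence(ids_a, ids_b):
    """Return the index of the first differing token id between two
    generations, or None if the sequences are token identical."""
    for i, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return i
    if len(ids_a) != len(ids_b):
        # One generation is a strict prefix of the other.
        return min(len(ids_a), len(ids_b))
    return None

# In practice the ids would come from greedy decoding, e.g. (sketch):
#   from transformers import AutoModelForCausalLM, BitsAndBytesConfig
#   quant = BitsAndBytesConfig(load_in_4bit=True)  # avoids the deprecation warning above
#   model = AutoModelForCausalLM.from_pretrained(path, quantization_config=quant,
#                                                device_map="auto")
#   out = model.generate(**inputs, do_sample=False, max_new_tokens=256)
# Greedy decoding (do_sample=False) makes the comparison deterministic, which is
# also why the 'temperature'/'top_p' flags in the logs are reported as ignored.

original = [4438, 311, 1304, 37556, 30]   # made-up ids for illustration
converted = [4438, 311, 1304, 37556, 30]
print(first_divergence(original, converted))  # None -> token identical
```

An exact token match under greedy decoding is a stronger guarantee than "looks coherent", since any weight-mapping mistake would almost certainly diverge within a few tokens.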
I don't understand, why specify a new architecture that's not supported in anything when it's the same as llama?
I sent you an invite, let me know if it works!
> I don't understand, why specify a new architecture that's not supported in anything when it's the same as llama?
I think companies often like to do this; e.g. Yi was essentially the same architecture as Llama. It probably looks better to investors to have the architecture set as "seed_oss" rather than "llama", and it might cause confusion if people think the model is Llama-based. Though it does come at the cost of ease of use, and that is why this org exists!
It works, thanks!
Gotta do something, but will upload conversions of the three models later today.
Yeah, I am handyconstructiongnat. Fair warning, I don't check Discord much... It doesn't even want me to login at the moment, heh.