
cloudyu/Mixtral_34Bx2_MoE_60B

Tags: Text Generation · Transformers · Safetensors · mixtral · yi · Mixture of Experts · Eval Results · text-generation-inference
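
The tags above mark this repo as a Transformers-compatible text-generation model distributed as Safetensors. A minimal loading sketch under that assumption follows; the repo id comes from this page, while the dtype, device placement, and prompt are illustrative choices rather than a documented recipe.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "cloudyu/Mixtral_34Bx2_MoE_60B"

# Load tokenizer and model; bf16 and device_map="auto" are assumptions made to fit
# the ~60B-parameter MoE across available GPUs, not settings taken from the model card.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Simple generation call as a smoke test.
prompt = "Question: what is a mixture-of-experts model?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))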
Community: 16 discussions
• #14 "From your work, I find a new way to do model ensemble" (1 comment), opened about 1 year ago by xxx1

• #12 "Adding Evaluation Results", opened about 1 year ago by leaderboard-pr-bot

• #11 "The function_calling and translation abilities are weaker than Mixtral 8x7B" (1 comment), opened over 1 year ago by bingw5

• #10 "Add mixture of experts tag", opened over 1 year ago by davanstrien

• #9 "How does this model work? Can you share your idea or training process? Thanks", opened over 1 year ago by zachzhou

• #8 "Add merge tag" (👍 2, 2 comments), opened over 1 year ago by osanseviero

• #7 "VRAM" (2 comments), opened over 1 year ago by DKRacingFan

• #6 "Source code and paper?" (👍 2, 8 comments), opened over 1 year ago by josephykwang

• #5 "How does the MoE work?" (👍 1, 3 comments), opened over 1 year ago by PacmanIncarnate

• #4 "Quant pls?" (6 comments), opened over 1 year ago by Yhyu13

• #3 "What is your config?" (👍 1, 1 comment), opened over 1 year ago by Weyaxi

• #2 "Should not be called Mixtral; the models merged into the MoE are Yi-based" (👍 🤝 18 reactions, 9 comments), opened over 1 year ago by teknium

• #1 "Add merge tags" (👍 3), opened over 1 year ago by JusticeDike