dongyh/FANformer-1B

Text Generation
Transformers
Safetensors
English
hf_olmo
custom_code

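Is it possible that a Transformer built on the FAN structure, labeled as 1B and actually having 1B parameters, nevertheless uses as much GPU memory or RAM during training as a 3B model?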

1
#2 opened about 2 months ago by abaabbbab

How do you evaluate GSM8K?

#1 opened 2 months ago by yxli2123