Inference API not working

#3
by llamameta - opened

"The model Nexusflow/Athene-V2-Chat is too large to be loaded automatically (145GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints)."

Nexusflow org

Thank you! If you want to test the model, please try it in the direct chat on lmarena.ai (under the name athene-v2-chat).

On the free tier, HF only loads models up to 10 GB in size automatically. This model is 145 GB, so it cannot be loaded on the free tier.
Consider other API providers.
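If you want to check ahead of time whether a model is likely to hit this limit, here is a rough sketch. It assumes the 10 GB figure from the error message above, counts decimal gigabytes, and sums every file in the repo (not just the weights), so treat the result as an estimate. The helper names (`total_size_gb`, `fits_free_tier`, `repo_size_gb`) are made up for illustration; only `HfApi.model_info` comes from the `huggingface_hub` library.

```python
from typing import Iterable, Optional

# Assumption: limit taken from the error message quoted in this thread.
FREE_TIER_LIMIT_GB = 10.0


def total_size_gb(file_sizes_bytes: Iterable[Optional[int]]) -> float:
    """Sum file sizes given in bytes and return decimal gigabytes.

    Entries that are None (size unknown) are skipped.
    """
    return sum(s for s in file_sizes_bytes if s is not None) / 1e9


def fits_free_tier(size_gb: float, limit_gb: float = FREE_TIER_LIMIT_GB) -> bool:
    """Return True if a model of this size is at or under the assumed limit."""
    return size_gb <= limit_gb


def repo_size_gb(repo_id: str) -> float:
    """Estimate the total size of a Hub model repo (needs network access)."""
    # Imported here so the pure helpers above work without the library installed.
    from huggingface_hub import HfApi

    info = HfApi().model_info(repo_id, files_metadata=True)
    return total_size_gb(f.size for f in (info.siblings or []))


if __name__ == "__main__":
    # Uses the size reported in the error message rather than a live lookup.
    print(fits_free_tier(145.0))
```

To check a live repo you could call `fits_free_tier(repo_size_gb("Nexusflow/Athene-V2-Chat"))`, which requires `huggingface_hub` to be installed and a network connection.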
