GPU for inference
#3
opened by vt404v2
The chat with h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v2 at https://gpt-gm.h2o.ai/ looks very fast. Can you tell me which GPU you are using for inference? I get about 6.5 tokens/s with a 500-token prompt and 32 new tokens on an A100 80GB.
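For reference, a throughput figure like 6.5 tokens/s can be measured by timing one generation call and dividing the number of new tokens by the elapsed wall-clock time. A minimal sketch of that measurement follows; `tokens_per_second` and the stub `fake_generate` are hypothetical helpers (the stub stands in for a real model or endpoint call), not anything from this thread.

```python
import time

def tokens_per_second(generate_fn, prompt, max_new_tokens):
    """Time one generation call and report throughput for the new tokens only.

    generate_fn(prompt, max_new_tokens) is assumed to block until all
    max_new_tokens tokens have been produced.
    """
    start = time.perf_counter()
    generate_fn(prompt, max_new_tokens)
    elapsed = time.perf_counter() - start
    return max_new_tokens / elapsed

# Stub standing in for a real model call; pretends to emit tokens
# at a fixed rate of 100 tokens/s.
def fake_generate(prompt, max_new_tokens, seconds_per_token=0.01):
    time.sleep(max_new_tokens * seconds_per_token)

rate = tokens_per_second(fake_generate, "some 500-token prompt", 32)
print(f"{rate:.1f} tokens/s")
```

Note that this counts only the decode phase as seen by the caller; prompt length still matters because prefill time is included in the elapsed interval.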
We are hosting the model on an A100 80GB using the awesome inference repository from Hugging Face: https://github.com/huggingface/text-generation-inference.
Actually, the GPU is even shared with the other 7B model.
Thanks, it works for me.
vt404v2 changed discussion status to closed