Hardware recommendations help

#100
by MLSDev - opened

Hi, I want to deploy the model to test it in a RAG pipeline.

What GPU RAM is recommended for inference with a 1024-token chunk size? For context, the sketch below is roughly how I plan to measure it.
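This is only a rough peak-memory check, assuming the model loads through Hugging Face `transformers`; the model id is a placeholder for whichever checkpoint I end up deploying:

```python
# Rough peak-memory check for one 1024-token chunk.
# Assumption: the model loads via transformers' AutoModel; the model id is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "jinaai/jina-embeddings-v2-base-en"  # placeholder checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True).to(device).eval()

# Dummy text, truncated to the 1024-token chunk size I want to use in the RAG pipeline.
inputs = tokenizer("some document text " * 300, truncation=True,
                   max_length=1024, return_tensors="pt").to(device)

with torch.no_grad():
    model(**inputs)

if device == "cuda":
    print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```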

Is CPU-only inference possible? My Hetzner dedicated VPS doesn't seem to be enough for this.

I tried the Jina AI API, but the tokens on my API key seem to disappear even when I am not using it. Below is roughly the request I was sending.
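For reference only; the model name is a placeholder and I'm assuming the usual OpenAI-style `data` list in the response:

```python
# Approximate request against the hosted Jina AI embeddings endpoint.
# The model name is a placeholder; adjust it to whatever the API key covers.
import os
import requests

resp = requests.post(
    "https://api.jina.ai/v1/embeddings",
    headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
    json={"model": "jina-embeddings-v2-base-en",  # placeholder model name
          "input": ["first chunk of text", "second chunk of text"]},
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json()["data"]), "embeddings returned")
```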

I found Xinference as one way to serve and integrate the model; any other recommendations? The sketch below is roughly how I would wire it up.
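This is just a sketch of the Xinference route, assuming its Python client and an embedding-type model; the model name is a placeholder and has to match an entry in Xinference's model registry:

```python
# Sketch of serving the model through Xinference.
# Start the server first (e.g. with `xinference-local`), then launch the model via the client.
from xinference.client import Client

client = Client("http://127.0.0.1:9997")
model_uid = client.launch_model(model_name="jina-embeddings-v2-base-en",  # placeholder name
                                model_type="embedding")
model = client.get_model(model_uid)

result = model.create_embedding("a 1024-token chunk from my RAG corpus")
print(len(result["data"][0]["embedding"]), "dimensions")
```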
