dim is 1024?
#5, opened by chaochaoli
The 1024 dimension consumes a lot of memory for a large amount of data.
OK, can I use matryoshka_dims?
Hello!
Yes, you can use Matryoshka-style embedding truncation to reduce the memory usage/disk space of your embeddings. See the second code block here for the usage via the truncate_dim parameter: https://huggingface.co/sentence-transformers/static-similarity-mrl-multilingual-v1#direct-usage-sentence-transformers
And this shows roughly the performance that you might expect when using specific truncation dimensions: https://huggingface.co/sentence-transformers/static-similarity-mrl-multilingual-v1#matryoshka-evaluation
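To put rough numbers on the savings, here is a minimal sketch in plain Python. It is not taken from the model card: the helper names `embedding_bytes` and `truncate_and_normalize` are hypothetical, the storage estimate assumes float32 vectors with no index overhead, and rescaling to unit length after truncation is an assumption that suits cosine similarity. In practice the truncate_dim parameter linked above handles the truncation for you.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: embeddings are stored as float32


def embedding_bytes(num_vectors: int, dim: int) -> int:
    """Raw storage for num_vectors embeddings of size dim (no index overhead)."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [x / norm for x in head] if norm > 0 else head


if __name__ == "__main__":
    full = embedding_bytes(1_000_000, 1024)
    small = embedding_bytes(1_000_000, 256)
    print(f"1024 dims: {full / 1e9:.2f} GB, 256 dims: {small / 1e9:.2f} GB")
    print(truncate_and_normalize([3.0, 4.0, 12.0], 2))
```

Storage scales linearly with the dimension, so going from 1024 to 256 dimensions cuts the raw size by a factor of four. How much quality that costs depends on the model, which is what the evaluation linked above reports.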
- Tom Aarsen
chaochaoli changed discussion status to closed