Using the model locally without calling the Hugging Face API
#20
by Mincookie - opened
Hi there, I'm currently attempting to vectorize strings locally by downloading all the files from the Hugging Face repo. Which JSON file should I be referencing, and how should I set up the arguments?
I've currently set up a testing scratch file like this:

from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]
json_file = ["config.json", "data_config.json", "modules.json", "sentence_bert_config.json", "special_tokens_map.json", "tokenizer.json", "tokenizer_config.json"]

for json in json_file:
    try:
        model = SentenceTransformer(rf'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\{json}')
    except Exception as e:
        print(e)
        print(json)
        print("")
        continue

embeddings = model.encode(sentences)
print(embeddings)
But every JSON file I test from the repository returns a different error message:
Unable to load weights from pytorch checkpoint file for 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\config.json' at 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\config.json'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
config.json
list indices must be integers or slices, not str
data_config.json
list indices must be integers or slices, not str
modules.json
Unable to load weights from pytorch checkpoint file for 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\sentence_bert_config.json' at 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\sentence_bert_config.json'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
sentence_bert_config.json
Unable to load weights from pytorch checkpoint file for 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\special_tokens_map.json' at 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\special_tokens_map.json'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
special_tokens_map.json
Unable to load weights from pytorch checkpoint file for 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\tokenizer.json' at 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\tokenizer.json'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
tokenizer.json
Unable to load weights from pytorch checkpoint file for 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\tokenizer_config.json' at 'C:\Users\MinCookie\Documents\git_repos\hyperDB\all-MiniLM-L6-v2\tokenizer_config.json'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
tokenizer_config.json
Which JSON file should be used, and how should the SentenceTransformer arguments be set up, if I'm running this locally?
Edit: I found a way to save the loaded model manually and then reload it from the local location.
Mincookie changed discussion status to closed