How do I use this model?

#3
by treehugg3 - opened

I'm getting logits that look like this after running the model:
[-3.492682456970215,3.688016653060913]

How do I interpret them? Do I just run a softmax and conclude that the reward model likes this response? And how should I compare two different answers? Thank you!

Sorry, I made a mistake: I was running some kind of DistilBERT model when I meant to be running this one. This model returns a single logit, such as 1.7047538243223, and a higher number means the reward model favors that output more than another output.
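
For anyone else landing here, this is a minimal sketch of how two answers could be compared once you have one scalar score per answer. It assumes the reward model was trained with a Bradley-Terry style pairwise objective, which this thread does not confirm, so treat the probability as a rough reading, not a calibrated number. The function names and the second score (0.25) are made up for illustration; only 1.7047538243223 comes from the post above.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Estimate P(A is preferred over B) as sigmoid(reward_a - reward_b).

    Assumption: the scores behave like Bradley-Terry rewards. If the model
    was trained differently, only the ordering of the scores is meaningful.
    """
    diff = reward_a - reward_b
    # Numerically stable sigmoid: avoid exponentiating a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rank_responses(rewards: dict[str, float]) -> list[str]:
    """Return response labels sorted from most to least favored."""
    return sorted(rewards, key=rewards.get, reverse=True)


score_a = 1.7047538243223  # the single logit quoted in this thread
score_b = 0.25             # hypothetical score for a second answer

print(rank_responses({"answer_a": score_a, "answer_b": score_b}))
print(preference_probability(score_a, score_b))
```

Scores are only comparable between answers to the same prompt. If all you need is to pick the better answer, comparing the raw logits directly is enough and the sigmoid step can be skipped.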
