How to use the model?
I tried using the model like so:
from transformers import pipeline
pipe = pipeline("text-classification", model="nothingiisreal/open-gpt-3.5-detector")
print(pipe("For classifying messages written to another human (like emails, texts, or chat messages), the HC3 (Human ChatGPT Comparison Corpus) would likely be your best option among those I mentioned."))
print(pipe("To find a button with the text 'Log in' directly in the browser's DevTools Console, you can use these JavaScript commands:"))
print(pipe("my name is Will"))
The first two are copied from a chat with an LLM and the last is from a human conversation.
results:
[{'label': 'LABEL_0', 'score': 0.9978427886962891}]
[{'label': 'LABEL_0', 'score': 0.994192898273468}]
[{'label': 'LABEL_0', 'score': 0.9978604912757874}]
I think these are pretty simple use cases, so I was surprised by the wrong output.
Am I missing something? Should I tokenize the prompts before using the model? If so, with what tokenizer?
Thanks in advance!
Isn't that what it says in the readme? 0 means human, 1 means LLM. 0.9978427886962891 means it's 99% sure it's LLM.
edit: oh, I was wrong, I thought it was the formatting that was incorrect
edit: wrong for the second time about the label lol. The labels are LABEL_0 and LABEL_1, so 0.9978427886962891 means it's 99% sure it's human.
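For example, a minimal sketch that maps the raw labels to readable names, assuming the readme's convention as read above (LABEL_0 = human, LABEL_1 = LLM); model.config.id2label or the model card would confirm this:

from transformers import pipeline

pipe = pipeline("text-classification", model="nothingiisreal/open-gpt-3.5-detector")

# Assumed mapping based on the readme as discussed above
label_names = {"LABEL_0": "human", "LABEL_1": "LLM"}

result = pipe("my name is Will")[0]
print(label_names.get(result["label"], result["label"]), result["score"])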
Yes, you are correct, but the first two prompts are copied from an LLM chat. I think they are an easy case, yet they both got classified as human with high confidence.
So I wanted to know if some pre-processing needs to happen before applying the model.
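For reference, my understanding is that the pipeline already tokenizes internally, so no manual pre-processing should be needed. A minimal sketch of the equivalent manual steps, assuming the model repo ships its own tokenizer config:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "nothingiisreal/open-gpt-3.5-detector"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# These are roughly the same steps the pipeline runs: tokenize, forward pass, softmax, argmax.
inputs = tokenizer("my name is Will", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
pred = int(probs.argmax())
print(model.config.id2label[pred], float(probs[pred]))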