generate vs forward

#99
by xyutech - opened

Hello,
I used the model in two ways: one is inference with `generate`, the other with `forward`.
Sequences from `generate`, decoded with `processor.batch_decode(sequence, skip_special_tokens=False, clean_up_tokenization_spaces=True)`, give a pretty reasonable result.
But when I take the logits from the output of `forward`, apply `torch.argmax(logits, dim=-1)`, and pass the result to `processor.batch_decode`, the final result is pretty confusing.
Can someone shed light on what magic happens inside `generate` beyond `torch.argmax`?
Thank you!
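For context on the difference: a causal LM's `forward` returns logits for *every* input position, where position `t` scores the token that should *follow* token `t`. Taking `argmax` over all of them therefore gives the model's next-token guess at each position, shifted by one relative to the input, not a generated continuation. Greedy `generate`, by contrast, runs a loop: it keeps only the last position's argmax, appends it, and re-runs the model (it also handles EOS, caching, and optional sampling). A minimal sketch of that difference, using a hypothetical lookup table in place of a real model:

```python
# Toy "causal LM": logits at position t depend only on the token at t.
# TABLE is a made-up score matrix standing in for a real forward pass.
VOCAB = 4
TABLE = [
    [0.1, 0.9, 0.0, 0.0],  # after token 0, token 1 scores highest
    [0.0, 0.1, 0.9, 0.0],  # after token 1, token 2 scores highest
    [0.0, 0.0, 0.1, 0.9],  # after token 2, token 3 scores highest
    [0.9, 0.0, 0.0, 0.1],  # after token 3, token 0 scores highest
]

def forward(tokens):
    """Logits for every position; row t scores the token FOLLOWING tokens[t]."""
    return [TABLE[t] for t in tokens]

def argmax(row):
    return max(range(len(row)), key=lambda i: row[i])

def greedy_generate(prompt, max_new_tokens=3):
    """Autoregressive loop: keep only the LAST position's argmax each step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = forward(tokens)          # re-run the model on the grown sequence
        tokens.append(argmax(logits[-1])) # feed the prediction back in
    return tokens

prompt = [0, 1]
print(greedy_generate(prompt))               # [0, 1, 2, 3, 0] -- a continuation
print([argmax(r) for r in forward(prompt)])  # [1, 2] -- shifted next-token guesses
```

Note how the naive per-position argmax over the prompt's logits (`[1, 2]`) is just the input shifted by one prediction step, which is why decoding it looks garbled next to `generate`'s output.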
