Inference

#1
by tfshen - opened

I notice that the inference code is as follows:

```python
outputs = self.language_model.generate(
    input_ids=input_ids, inputs_embeds=inputs_embeds, **gen_kwargs
)
```
and it doesn't pass in `attention_mask` or `position_ids`.
I also notice that the `get_masks` function in ChatGLM takes the padding information into account.
So should I pass `attention_mask` and `position_ids` to the `generate` function?
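For reference, here is a minimal sketch of how one could build an attention mask from `input_ids` and pass it through to `generate`. The helper name `build_attention_mask` and the pad token id are hypothetical; this is not the repository's code, just an illustration of the kwargs in question:

```python
import torch

def build_attention_mask(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Hypothetical helper: 1 for real tokens, 0 for padding,
    # which is the convention Hugging Face generate() expects.
    return (input_ids != pad_token_id).long()

# Example: a batch of two left-padded sequences, pad id 0 (assumed).
ids = torch.tensor([[0, 0, 5, 6],
                    [7, 8, 9, 10]])
mask = build_attention_mask(ids, pad_token_id=0)
# The mask would then be forwarded alongside input_ids, e.g.:
# outputs = self.language_model.generate(
#     input_ids=input_ids, inputs_embeds=inputs_embeds,
#     attention_mask=mask, **gen_kwargs
# )
```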