Inference

#1
by tfshen - opened

I notice that the inference code is as follows:

```python
outputs = self.language_model.generate(
    input_ids=input_ids, inputs_embeds=inputs_embeds, **gen_kwargs
)
```
and it doesn't pass in `attention_mask` or `position_ids`.
I also notice that the `get_masks` function in ChatGLM takes the padding information into account.
So should I pass `attention_mask` and `position_ids` to the `generate` function?
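For reference, here is a minimal sketch of how one could build an attention mask from `input_ids` and pass it through to `generate`. The helper name `build_attention_mask` and the pad token id are hypothetical; this is not the repository's code, just an illustration of the kwargs in question:

```python
import torch

def build_attention_mask(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    # Hypothetical helper: 1 for real tokens, 0 for padding,
    # which is the convention Hugging Face generate() expects.
    return (input_ids != pad_token_id).long()

# Example: a batch of two left-padded sequences, pad id 0 (assumed).
ids = torch.tensor([[0, 0, 5, 6],
                    [7, 8, 9, 10]])
mask = build_attention_mask(ids, pad_token_id=0)
# The mask would then be forwarded alongside input_ids, e.g.:
# outputs = self.language_model.generate(
#     input_ids=input_ids, inputs_embeds=inputs_embeds,
#     attention_mask=mask, **gen_kwargs
# )
```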