unifying the input shape of the text-only branch and the text-image branch
The text-only and text-image forward passes in modeling.py currently take inputs of different shapes. We should keep all of the data coming from the dataloader and remove the text-only branch that takes only the first batch from the inner batch.
This way, the input to InternLMXComposer2ForCausalLM.forward will always have shape (1, bs).
Inside InternLMXComposer2ForCausalLM.forward:
In image-text mode, interleav_wrap encodes ['text_input'] of size (1, bs).
In text-only mode, ['text_input'] is first squeezed into a list of size (bs,), and the tokenizer then encodes the reshaped text inputs.
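The text-only squeeze step can be sketched roughly as follows. This is only an illustration, not the code in modeling.py: the helper name squeeze_text_input and the sample strings are hypothetical, and it assumes ['text_input'] arrives as a nested Python list of shape (1, bs).

```python
from typing import Dict, List


def squeeze_text_input(samples: Dict[str, List[List[str]]]) -> List[str]:
    """Flatten a (1, bs) nested list of strings into a flat list of length bs.

    Hypothetical helper for illustration; the real forward() may do this inline.
    """
    text_input = samples["text_input"]
    # The unified layout assumed here has exactly one outer entry.
    if len(text_input) != 1:
        raise ValueError(f"expected outer dimension 1, got {len(text_input)}")
    # Keep every sample of the inner batch instead of only the first one.
    return list(text_input[0])


# Made-up batch with bs = 3, shaped (1, bs) as delivered by the dataloader.
samples = {"text_input": [["describe the scene", "hello", "summarize this"]]}
flat_texts = squeeze_text_input(samples)
```

The resulting flat list of bs strings is the form a tokenizer can encode as one batch, so no sample from the inner batch is dropped.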
This change should be made along with https://github.com/InternLM/InternLM-XComposer/pull/410
Here are two references:
https://github.com/InternLM/InternLM-XComposer/issues/408
https://github.com/InternLM/InternLM-XComposer/issues/404