Question about M-RoPE
#25
by JavenChen · opened
I read the implementation of M-RoPE, especially the code here:
Do you only use height and width for the rotary position embedding?
Consider the patch at spatial position (1,1) in frame 0 and the patch at (1,1) in frame 1: when attention is computed, is their encoded relative distance 0?
My mistake. This rotary_pos_emb is for the vision model. It makes sense now!
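To illustrate the point resolved above, here is a minimal sketch (not the actual implementation; the function name and layout are hypothetical) of vision-side rotary position ids built only from height and width. Because there is no temporal component, the patch at (1,1) in frame 0 and the patch at (1,1) in frame 1 receive identical position ids, so their relative rotary distance is indeed 0:

```python
def vision_rope_pos_ids(num_frames, height, width):
    """Hypothetical sketch: per-patch (h, w) position ids for a vision
    encoder whose rotary embedding uses only spatial coordinates.

    Patches are laid out row-major within a frame, and the same spatial
    grid repeats for every frame (no temporal index)."""
    per_frame = [(h, w) for h in range(height) for w in range(width)]
    return per_frame * num_frames


ids = vision_rope_pos_ids(num_frames=2, height=4, width=4)

# Patch (1, 1) sits at row-major index 1*4 + 1 = 5 within each frame.
frame0_patch = ids[5]            # frame 0, patch (1, 1)
frame1_patch = ids[16 + 5]       # frame 1, patch (1, 1); frame stride = 4*4

# Identical position ids => zero relative rotary distance in attention.
assert frame0_patch == frame1_patch == (1, 1)
```

Since the rotary rotation angle is a function of the position id alone, equal ids mean the relative phase between the two patches cancels in the attention dot product, which is what the question was asking about.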