Is it possible to group coordinates per object type using "Point to <something>"?

#3
by hadim - opened

Is it possible to group the coordinates per object type or do we need to run one inference step per object type?

I tried:

  • "Point to object 1 then to object 2 etc"
  • "Point to object 1. Point to object 2. etc"

But none of those work.

The model is able to point to multiple things at the same time. Can you share the exact prompt you tried and the result you got?

"Point to all the black stones on the main go board. Point to all the white stones on the main go board."

I tried many different prompts. Can you share with me one that you know works?

Example attached

Screenshot 2024-09-25 at 2.19.32 PM.png

but maybe not what you meant by grouping

Sorry, I should have been clearer. By grouping, I mean that instead of getting one single <points x1="8.9" y1="92.0" ></points>, I would get one for every object:

<points x1="8.9" y1="92.0" alt="object1">object1</points><points x1="8.9" y1="92.0" alt="object2">object2</points>
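If the model does return one tag per object type as in the example above, the coordinates could be grouped by label with a small parser. This is only a sketch: it assumes the output uses `<points ...>label</points>` tags with `x1`/`y1`-style attributes and an optional `alt` attribute, exactly as in the example, and the function name `group_points` is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed tag shape, taken from the example in this thread:
#   <points x1="8.9" y1="92.0" alt="object1">object1</points>
POINTS_TAG = re.compile(r"<points?\s+([^>]*)>(.*?)</points?>", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')
COORD_NAME = re.compile(r"([xy])(\d*)")


def group_points(text):
    """Return {label: [(x, y), ...]} for every points tag found in text."""
    groups = defaultdict(list)
    for attr_text, inner in POINTS_TAG.findall(text):
        attrs = dict(ATTRIBUTE.findall(attr_text))
        # Prefer the alt attribute; fall back to the text inside the tag.
        label = attrs.get("alt") or inner.strip()
        xs, ys = {}, {}
        for name, value in attrs.items():
            match = COORD_NAME.fullmatch(name)
            if match:
                target = xs if match.group(1) == "x" else ys
                target[match.group(2)] = float(value)
        # Pair x1 with y1, x2 with y2, and so on, in index order.
        for index in sorted(xs, key=lambda s: int(s or 0)):
            if index in ys:
                groups[label].append((xs[index], ys[index]))
    return dict(groups)


example = (
    '<points x1="8.9" y1="92.0" alt="object1">object1</points>'
    '<points x1="8.9" y1="92.0" alt="object2">object2</points>'
)
grouped = group_points(example)
```

A tag without `alt` and without inner text ends up under the empty-string label, which is how the single ungrouped result would show up.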

I see. Yeah, I'm not sure that works - @chrisc36 / @sanghol might know!

The standard pointing mode does not support that. However, you could try the point-question-answering mode. It can be a bit unreliable, but it might be able to handle that kind of request with the right prompt. To turn it on, prefix your input query with: "point_qa:"
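As a rough illustration of the prefixing step, the query string could be built like this before it is handed to whatever inference code you already use. The helper name `build_point_qa_prompt` is hypothetical, and the single space after the colon is an assumption, not something confirmed in this thread.

```python
POINT_QA_PREFIX = "point_qa:"  # prefix quoted in this thread


def build_point_qa_prompt(query, separator=" "):
    """Prefix a query with point_qa: (the separator is an assumption)."""
    query = query.strip()
    # Avoid adding the prefix twice if the caller already included it.
    if query.startswith(POINT_QA_PREFIX):
        return query
    return f"{POINT_QA_PREFIX}{separator}{query}"


prompt = build_point_qa_prompt(
    "Point to all the black stones on the main go board. "
    "Point to all the white stones on the main go board."
)
```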

image.png

point_qa: seems to work well. Thank you!

hadim changed discussion status to closed
