Image Labels: One-shot image-conditioned object detection

by godaspeg - opened Jun 26

Jun 26

Is it possible to detect objects using images as labels instead of text? Since OWL-ViT is based on CLIP embeddings, I think this should be theoretically possible.

godaspeg changed discussion title from Image Labels to Image Labels: One-shot image-conditioned object detection Jun 26

nielsr

Jun 27

Yes, image-guided object detection is supported. See the demo notebook: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/OWLv2/Zero_and_one_shot_object_detection_with_OWLv2.ipynb
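
For anyone landing here, the image-guided path can be sketched roughly as follows. This is a minimal sketch and not taken from the notebook: it assumes the `transformers` OWLv2 classes (`Owlv2Processor`, `Owlv2ForObjectDetection`) and the `google/owlv2-base-patch16-ensemble` checkpoint, and the helper names (`target_size_hw`, `load_owlv2`, `detect_with_query_image`) and the threshold values are made up for illustration. The query image plays the role of the label: the model looks for regions in the target image whose embedding is similar to the query's.

```python
def target_size_hw(image):
    """Return (height, width) for a PIL-style image whose .size is (width, height)."""
    width, height = image.size
    return (height, width)


def load_owlv2(checkpoint="google/owlv2-base-patch16-ensemble"):
    """Load processor and model. The checkpoint name is an assumption; swap in your own."""
    # Imported lazily so the small helper above works without the heavy dependencies.
    from transformers import Owlv2ForObjectDetection, Owlv2Processor

    processor = Owlv2Processor.from_pretrained(checkpoint)
    model = Owlv2ForObjectDetection.from_pretrained(checkpoint)
    model.eval()
    return processor, model


def detect_with_query_image(processor, model, image, query_image,
                            threshold=0.9, nms_threshold=0.3):
    """Find regions of `image` that resemble `query_image` (one-shot detection).

    The default thresholds are illustrative starting points, not tuned values.
    """
    import torch

    # The processor accepts the target image and the query image together.
    inputs = processor(images=image, query_images=query_image, return_tensors="pt")

    with torch.no_grad():
        outputs = model.image_guided_detection(**inputs)

    # Rescale boxes to pixel coordinates. Caveat: OWLv2 pads inputs to a square,
    # so for non-square images verify the box alignment against the notebook.
    target_sizes = torch.tensor([target_size_hw(image)])
    results = processor.post_process_image_guided_detection(
        outputs=outputs,
        threshold=threshold,
        nms_threshold=nms_threshold,
        target_sizes=target_sizes,
    )
    # One dict per target image, with "scores" and "boxes" entries.
    return results[0]
```

Usage would look something like `processor, model = load_owlv2()` followed by `detect_with_query_image(processor, model, Image.open("scene.jpg"), Image.open("query_crop.jpg"))`, where both file names are placeholders. The query works best as a tight crop of a single object; if nothing comes back, lowering `threshold` is the first thing to try.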

godaspeg changed discussion status to closed 28 days ago
