Improve model card: Add Hear-Your-Click context and refined metadata
#63
opened by nielsr (HF Staff)
This PR updates the model card for `openai/clip-vit-base-patch32`. It clarifies that this CLIP model serves as a key component (the visual encoder) within the "Hear-Your-Click: Interactive Object-Specific Video-to-Audio Generation" framework.
The changes include:
- Retaining the detailed description of the `openai/clip-vit-base-patch32` model.
- Adding a new section that introduces "Hear-Your-Click", its abstract, a link to its paper (2507.04959), and its GitHub repository (https://github.com/SynapGrid/Hear-Your-Click-2024).
- Updating metadata with `license: mit` and `library_name: transformers`, and confirming `pipeline_tag: zero-shot-image-classification`.
- Adding tags such as `clip` and `video-to-audio` for better discoverability and context.
- Including the BibTeX citation for the "Hear-Your-Click" paper.
This update provides valuable context for users interested in the applications of this foundational CLIP model.