Amitz244 committed
Commit 340b294 · verified · 1 Parent(s): 895c691

Update README.md

Files changed (1): README.md (+21 −6)
README.md CHANGED
@@ -11,8 +11,9 @@ tags:
 - LaMem
 - THINGS
 ---
-
-PreceptCLIP-Memorability is a model designed to predict image memorability (the likelihood of an image to be remembered). This is the official model from the paper ["Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks"](https://arxiv.org/abs/2503.13260). We apply LoRA adaptation on the CLIP visual encoder with an additional MLP head. Our model *achieves state-of-the-art results*.
+**PerceptCLIP-Memorability** is a model designed to predict **image memorability** (the likelihood of an image to be remembered). This is the official model from the paper:
+📄 **["Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks"](https://arxiv.org/abs/2503.13260)**.
+We apply **LoRA adaptation** on the **CLIP visual encoder** and add an **MLP head** for memorability prediction. Our model achieves **state-of-the-art results**.
 
 ## Training Details
 
@@ -23,7 +24,9 @@ PreceptCLIP-Memorability is a model designed to predict image memorability (the
 - *Learning Rate*: 5e-05
 - *Batch Size*: 32
 
-## Requirements
+## Installation & Requirements
+
+You can set up the environment using environment.yml or manually install dependencies:
 - python=3.9.15
 - cudatoolkit=11.7
 - torchvision=0.14.0
@@ -39,12 +42,24 @@ from torchvision import transforms
 import torch
 from PIL import Image
 from huggingface_hub import hf_hub_download
+import importlib.util
+
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
-# Load model
-model_path = hf_hub_download(repo_id="PerceptCLIP/PerceptCLIP_Memorability", filename="perceptCLIP_Memorability.pth")
-model = torch.load(model_path).to(device).eval()
+# Load the model class definition dynamically
+class_path = hf_hub_download(repo_id="PerceptCLIP/PerceptCLIP_Memorability", filename="modeling.py")
+spec = importlib.util.spec_from_file_location("modeling", class_path)
+modeling = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(modeling)
+
+# Initialize a model
+ModelClass = modeling.clip_lora_model
+model = ModelClass().to(device)
 
+# Load the pretrained weights
+model_path = hf_hub_download(repo_id="PerceptCLIP/PerceptCLIP_Memorability", filename="perceptCLIP_Memorability.pth")
+model.load_state_dict(torch.load(model_path, map_location=device))
+model.eval()
 # Load an image
 image = Image.open("image_path.jpg").convert("RGB")
 
 
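The new README fetches `modeling.py` at runtime so the class definition ships alongside the checkpoint, and its snippet stops right after loading the image. Below is a minimal sketch of how inference might continue, assuming the standard CLIP preprocessing (224×224 center crop with CLIP's normalization constants) and a single scalar memorability score as output; neither detail is confirmed by this diff.

```python
# Hedged continuation of the README snippet above (reuses `model`, `device`,
# and `image` from it). The 224x224 crop and the normalization constants are
# the standard CLIP preprocessing values, and the single scalar output is an
# assumption; the diff itself does not show this part.
import torch
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),  # CLIP mean
                         std=(0.26862954, 0.26130258, 0.27577711)),  # CLIP std
])

inputs = preprocess(image).unsqueeze(0).to(device)  # add a batch dimension

with torch.no_grad():
    score = model(inputs)

print(f"Predicted memorability: {score.item():.3f}")
```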
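For readers curious what "LoRA adaptation on the CLIP visual encoder with an MLP head" typically looks like: the sketch below is not the repository's `clip_lora_model` (that lives in `modeling.py`), only an illustration of the general pattern using the `transformers` and `peft` libraries. The backbone name, LoRA rank, target modules, and head sizes are all illustrative assumptions.

```python
# Illustrative sketch only: the real implementation is clip_lora_model in the
# repo's modeling.py. The backbone, LoRA rank, target modules, and head sizes
# below are assumptions chosen for illustration, not the paper's actual setup.
import torch.nn as nn
from transformers import CLIPVisionModel
from peft import LoraConfig, get_peft_model


class CLIPLoRARegressor(nn.Module):
    def __init__(self, backbone="openai/clip-vit-base-patch32"):
        super().__init__()
        vision = CLIPVisionModel.from_pretrained(backbone)
        # LoRA on the attention projections of the visual encoder; the base
        # weights are frozen and only the low-rank adapter matrices train.
        lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
        self.vision = get_peft_model(vision, lora_cfg)
        hidden = vision.config.hidden_size
        # Small MLP head mapping the pooled image embedding to one score.
        self.head = nn.Sequential(nn.Linear(hidden, 512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, pixel_values):
        pooled = self.vision(pixel_values=pixel_values).pooler_output
        return self.head(pooled)
```

In this pattern only the adapter matrices and the MLP head receive gradients while the CLIP backbone stays frozen, which is the usual motivation for LoRA fine-tuning.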
65