Update README.md

README.md (changed)
</a>
</div>

Charm Tokenizer has the following input args:
* patch_selection (str): The method for selecting important patches.
  * Options: 'saliency', 'random', 'frequency', 'gradient', 'entropy', 'original'.
* ...
* downscale_shortest_edge (int): Used for the 'original' patch selection strategy (default: 256).
* without_pad_or_dropping (bool): Whether to avoid padding or dropping patches (default: True).

**Note:** While random patch selection during training helps avoid overfitting, fully deterministic patch selection approaches should be used during inference for consistent results.
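
For illustration, a fully deterministic configuration that exercises the 'original' strategy might look like the sketch below. Passing downscale_shortest_edge to the constructor is an assumption based on the input args listed above:

```python
from Charm_tokenizer.ImageProcessor import Charm_Tokenizer

# A sketch of a deterministic configuration: 'original' patch selection with
# its downscale knob. downscale_shortest_edge as a constructor argument is an
# assumption based on the documented input args.
charm_tokenizer = Charm_Tokenizer(
    patch_selection='original',
    training_dataset='tad66k',
    downscale_shortest_edge=256,
    without_pad_or_dropping=True,
)
```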

The output is the preprocessed tokens, their corresponding positional embeddings, and a mask token that indicates which patches are in high resolution and which are in low resolution.

```python
from Charm_tokenizer.ImageProcessor import Charm_Tokenizer

img_path = r"img.png"

charm_tokenizer = Charm_Tokenizer(patch_selection='frequency', training_dataset='tad66k', without_pad_or_dropping=True)
tokens, pos_embed, mask_token = charm_tokenizer.preprocess(img_path)
```
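
As a quick sanity check, the returned objects can be inspected before moving on. A minimal sketch, assuming the tokenizer returns PyTorch tensors (the exact shapes depend on the input image and the chosen patch selection strategy):

```python
# Sanity check on the tokenizer outputs (assumes PyTorch tensors; the exact
# shapes depend on the input image and the patch selection strategy).
print(type(tokens), type(pos_embed), type(mask_token))
print(tokens.shape, pos_embed.shape, mask_token.shape)
```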
___

* Step 4) Predicting aesthetic/quality score

* If training_dataset is set to 'spaq' or 'koniq10k', the model predicts the image quality score. For other options ('aadb', 'tad66k', 'para', 'baid'), it predicts the image aesthetic score.

* Selecting a dataset with image resolutions similar to your input images can improve prediction accuracy.

* For more details about the process, please refer to the [paper](https://cvpr.thecvf.com/virtual/2025/poster/34423).

```python
from Charm_tokenizer.Backbone import backbone

model = backbone(training_dataset='tad66k', device='cpu')
prediction = model.predict(tokens, pos_embed, mask_token)
```
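
Switching from aesthetic to quality prediction only changes the training_dataset argument. A minimal end-to-end sketch with 'koniq10k'; pairing the tokenizer and the backbone on the same dataset is an assumption:

```python
from Charm_tokenizer.ImageProcessor import Charm_Tokenizer
from Charm_tokenizer.Backbone import backbone

# 'koniq10k' is a quality dataset, so the model predicts a quality score.
# Using the same training_dataset for the tokenizer and the backbone is an
# assumption, mirroring the paired examples above.
charm_tokenizer = Charm_Tokenizer(patch_selection='frequency', training_dataset='koniq10k', without_pad_or_dropping=True)
tokens, pos_embed, mask_token = charm_tokenizer.preprocess(r"img.png")

model = backbone(training_dataset='koniq10k', device='cpu')
prediction = model.predict(tokens, pos_embed, mask_token)
```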

**Note:** For the training code, check our [GitHub Page](https://github.com/FBehrad/Charm/).